Preface


Science is only worth doing if it is interesting and fun. Hence 
the goal of a textbook is to interest students in a subject, convince
them it is worth the effort required to learn about it, and
help them do so. We have tried here to do all three. 

For seismology, these should be easy. It is hard to imagine 
topics more interesting than the structure and evolution of a 
planet, as manifested by phenomena as dramatic as earthquakes.
Our goal is to address them via an introduction to
seismology, which is one of the cornerstones of the modern 
earth sciences. Seismology has been defined as the study of 
earthquakes and associated phenomena, or the study of elastic 
waves propagating in the earth. By integrating techniques and 
data from physics, mathematics, and geology, seismology has 
produced a remarkably sharp picture of the earth’s interior 
that is a primary datum for studying the formation and evolution
of terrestrial planets. Seismologists have also learned much
about the nature of earthquakes and the tectonic processes 
responsible for them. These studies are not of purely academic 
interest; seismology is the major tool for earthquake hazard 
assessment, hydrocarbon exploration, and the peacekeeping 
role of nuclear test monitoring. 

We thus believe that seismology should be part of the education
of every solid earth scientist, rather than a specialized
course for those whose primary interest is seismology or other
branches of geophysics. The subject has much to offer mineralogists
or petrologists studying the composition of the earth’s
interior, students of tectonics interested in processes of the
lithosphere, geologists interested in the nature and evolution
of the crust, engineers concerned with seismic hazards, and
planetologists interested in the evolution of the terrestrial planets.
As the earth sciences become increasingly integrated
and interdisciplinary, the advantages of understanding seismology
will continue to grow.

Many students have been deterred from the subject because 
it requires confronting, often for the first time, both the physics 
of a continuous medium and wave propagation. We view these 
concerns as manageable. In fact, we believe that seismology is 
a good way to introduce these topics, because it applies what 
might otherwise seem abstract ideas. Seismic waves illustrate 
effects like reflection, refraction, diffraction, and dispersion 
by using them to study the earth. Earthquakes demonstrate
concepts like rigid tectonic plates, stress and strain, and viscous
mantle flow. Thus seismology is a natural way to discuss fundamental
processes.

Our goal is to introduce key concepts and their application in 
present research. This twofold goal places several limitations 
on the text. First, time and space restrictions require a trade-off 
between the range of topics and the level of presentation. The 
resulting choices are, of necessity, subjective. Second, we end 
discussions when material, however fascinating, seems more 
appropriate for advanced classes or courses in a related field.¹
Third, these limitations preclude an account of the historical
development of the subject, or a systematic assignment of
credit for ideas and results. Fourth, in introducing topics of current
research, we try to give our sense of issues while recognizing
that others’ views may differ. The danger in presenting the
“current state of knowledge” in a text is that the field changes 
so rapidly that accounts can soon be out of date. We thus try to 
focus not on “what we know,” but on “how we seek to find 
out,” and highlight current findings in the context of studying 
interesting questions. 

Given these limitations, suggestions for further reading are 
provided. When possible, the readings are texts or reviews 
rather than specialized research papers. In many cases, the 
sources of the figures used to illustrate a concept provide 
additional information. We also give some references to sites 
on the World Wide Web, recognizing the trade-off between the 
wealth of information there and the fact that the Web is volatile 
and sites can change locations or vanish. 

The material is designed for advanced undergraduates and
first-year graduate students. Readers are assumed to be familiar
with ordinary differential equations and introductory
physics. Further background, including basic earth science
courses, is helpful but not essential. Material beyond this level
is derived as needed. Thus, we seek a balance between presenting
the mathematics like magic pulled from a hat and deriving
so much that the thematic flow is disrupted. Hence we
review some useful mathematics in an Appendix, to which we
refer. Other mathematical concepts, notably topics in Fourier
analysis, are used as needed and then presented in more depth
when appropriate.

1 Because subfields in the earth sciences overlap, the divisions between them are
not sharp, and a given topic draws on several. As John Muir, an early member of the
Seismological Society of America better known for founding the Sierra Club, pointed
out, “when we look at anything in isolation we realize it is hitched to the rest of the
universe.”

Our goal is to introduce some concepts about seismology
and its application to studies of earth structure and earthquakes.
Doing this requires developing basic ideas about wave
propagation in a continuous solid medium, so the material of 
greatest interest to geologically oriented readers is somewhat 
postponed. Readers are urged to enjoy rather than endure the 
introductory material on elasticity and wave propagation. They 
risk only discovering the appeal of these topics and finding 
themselves taking subsequent advanced courses. 

Part of the delight of the earth sciences is that they are less
structured than some other sciences. There is no single set of 
topics covered in specific courses, which instead reflect the 
instructor’s and students’ interests. Certainly this is the case 
here. The topics we have chosen contain about a year’s worth 
of class material, which we ourselves divide into several 
courses. Many students, of course, take only one. We have 
experimented with different groupings, all of which seemed to 
work well. We usually do not cover the Appendix in lectures, 
but assign its problems to identify areas for study or review. 

We have found that the homework problems are helpful
for understanding the topics. Given the nature of the modern
earth sciences, many problems are designed to be done on computers.
In our teaching, we expect that most will be done by
writing programs, and hence require programming, beginning
with simple problems in the Appendix and building to more
complex ones in the chapters. A secondary motive is to ensure
that students learn the skills of scientific programming, which
are often not stressed in computer classes. Some of the problems
can be done using spreadsheets, and most can be done
with specialized mathematical software.

Some matters of style are worth mentioning. We illustrate 
interconnections between topics by referring both forward and 
backward to other sections. Figures are labeled with hyphens 
(e.g. 5.6-2), and equations with periods (e.g. 5.3.2). Footnotes 
generally cover side observations which we note in class but are 
not essential. We use both SI units (those based on the meter, 
kilogram, and second) and cgs units (those based on the 
centimeter, gram, and second) because both are common in the 
literature, although SI units are slowly superseding cgs. We also 
use other units when customary: seismic velocities are given 
in km/s and plate motions are given in the more intuitive 
mm/yr (e.g., 48 mm/yr rather than 1.5 × 10⁻⁹ m/s), following
Emerson’s dictum that “a foolish consistency is the hobgoblin
of little minds.”

We have enjoyed writing this book. It is a pleasure to try to 
summarize this diverse and fascinating discipline. We hope 
readers have as much fun as we did, and that our discussions 
prompt them to raise interesting and provocative questions as 
well as learn the material. We also hope that some readers are 
motivated to continue study of and research on these topics. 
Much remains to be learned about the earth and earthquake 
processes, and the opportunities for contributions are great 
for those with the energy and imagination to go beyond our 
current knowledge and ideas. Three hundred years after Isaac 
Newton’s work in mechanics and optics laid what would 
become seismology’s foundations, it is worth recalling his
words: “I seem to have been only like a boy playing on the
seashore, and diverting myself in now and then finding a 
smoother pebble or a prettier shell than ordinary, whilst the 
great ocean of truth lay all still undiscovered before me.” 
Introduction



I cannot help feeling that seismology will stay in the place at the center of solid earth science for many, many years to come. 

The joy of being a seismologist comes to you, when you find something new about the earth’s interior from the observation of 
seismic waves obtained on the surface, and realize that you did it without penetrating the earth or touching or examining it directly. 

Keiiti Aki, presidential address to the Seismological Society of America, 1980 


1.1 Introduction 

This book is an introduction to seismology, the study of elastic
waves or sound waves in the solid earth. Conceptually, the subject
is simple. Seismic waves are generated at a source, which
can be natural, such as an earthquake, or artificial, such as an
explosion. The resulting waves propagate through the medium,
some portion of the earth, and are recorded at a receiver
(Fig. 1.1-1). A seismogram, the record of the motion of the
ground at a receiver called a seismometer, thus contains information
about both the source and the medium. This information
can take several forms. The waves provide information on
the location and nature of the source that generated them. If
the origin time when the waves left the source is known, their
arrival time at the receiver gives the travel time required to pass
through the medium, and hence information about the speed
at which they traveled, and thus the physical properties of the
medium. In addition, because the amplitude and shape of the
wave pulses that left the source are affected by propagation
through the medium, the signals observed on seismograms
provide additional information about the medium.

Fig. 1.1-1 Schematic geometry of a seismic experiment.

1.1.1 Overview 

Before embarking on our studies, it is worth briefly outlining
some of the ways in which seismology is used to study the
earth, and some of the methods used. Seismology is the primary
tool for the study of the earth’s interior because little of
the planet is accessible to direct observation. The surface can
be mapped and explored, and drilling has penetrated to depths
of up to 13 kilometers, though at great expense. Information
about greater depths, down to the center of the earth (approximately
6371 km), is obtained primarily from indirect methods.
Seismology, the most powerful such method, is used to map the
earth’s interior and study the distribution of physical properties.
The existence of the earth’s shallow crust, deeper mantle,
liquid outer core, and solid inner core is inferred from variations
in seismic velocity with depth. Our ideas about their
chemical compositions, including the presumed locations of
changes in mineral structure due to the increase of pressure
with depth, are also based on seismological data. Near the
surface, seismology provides detailed crustal images that reveal
information about the locations of economic resources like
oil and minerals. Deeper in the earth, seismology provides
the basic data for understanding earth’s dynamic history and
evolution, including the process of mantle convection.

Seismology is also the primary method for studies of earthquakes.
Most of the information about the nature of faulting
during an earthquake is determined from the resulting seismograms.
These observations are useful for several purposes.
Because earthquakes generally result from the motions of the
plates making up the earth’s lithosphere, which are the surface
expression of convection within earth’s mantle, knowledge
of the direction and amount of motion is valuable for
describing plate motions and the forces giving rise to them.
Analysis of seismograms also makes it possible to investigate
the physical processes that occur prior to, during, and after
faulting. Such studies are helpful in assessing the societal
hazards posed by earthquakes.

Our purpose here is to discuss some basic ideas about 
seismology and its applications. To do this, we first introduce 
several concepts about waves in a solid medium. We will see 
that a few simple but powerful ideas give a great deal of insight 
into how waves propagate and respond to variations in physical
properties in the earth. Fortunately, most of these ideas are
analogous to familiar concepts in the propagation of light 
and sound waves. As a result, studying the earth with seismic 
waves is conceptually similar to sensing the world around us 
using light and sound. For example, you are reading this by 
receiving light reflected off the paper. We see color because 
light has different wavelengths; the sky is blue because certain 
wavelengths are scattered preferentially. An even closer analogy
is the use of sound waves by bats, dolphins, and submarines
to “see” their surroundings. Seismology gives detailed images
of earth structure, much as sound waves (ultrasound) and
electromagnetic waves (X-rays) are used in medicine to study 
human bodies. 

A familiar property of light is that it bends when traveling between
materials in which its speed differs. Objects inserted into
water appear crooked, because light waves travel more slowly
in water than in air. Prisms and lenses use this effect, called refraction.
This phenomenon occurs in the earth because seismic
wave velocities generally increase with depth. Wave paths bend
away from the vertical as they go deeper into the earth, eventually
become horizontal (“bottom”), turn upward, and return to
the surface (Fig. 1.1-2). The wave paths are thus used to infer
the variation of seismic velocity, and hence the composition
and physical properties of material, with depth in the earth.
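
The bending of ray paths can be made concrete with Snell's law, which keeps sin(i)/v constant along a ray, where i is the angle from the vertical and v the local velocity. The following is only a minimal sketch: it assumes a stack of flat, uniform layers, and the function name and velocities are invented for illustration rather than taken from any earth model.

```python
import math

def trace_ray(velocities_km_s, takeoff_deg):
    """Follow a downgoing ray through flat uniform layers using Snell's law.

    Returns (angles_deg, turning_layer): the angle from vertical in each
    layer the ray enters, and the index of the layer at whose top the ray
    is turned back (None if it passes through every layer).
    """
    # The ray parameter p = sin(i) / v is conserved along the ray.
    p = math.sin(math.radians(takeoff_deg)) / velocities_km_s[0]
    angles = []
    for k, v in enumerate(velocities_km_s):
        s = p * v  # sine of the angle from vertical in layer k
        if s >= 1.0:
            # No transmitted ray exists: the path has become horizontal
            # ("bottomed") and turns back toward the surface.
            return angles, k
        angles.append(math.degrees(math.asin(s)))
    return angles, None

# Hypothetical velocities increasing with depth (km/s), for illustration only.
layers = [5.0, 6.0, 7.0, 8.0, 9.0]
angles, turning_layer = trace_ray(layers, takeoff_deg=40.0)
for v, a in zip(layers, angles):
    print(f"v = {v:.1f} km/s  angle from vertical = {a:.1f} deg")
print("ray turns back at the top of layer", turning_layer)
```

With these made-up numbers the angle from the vertical grows in each faster layer until the ray can no longer be transmitted, which is the layered analog of a ray bottoming.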



Fig. 1.1-2 Seismic ray paths in the earth, showing the effect of an increase 
in seismic velocity with increasing depth. The waves travel in curved paths 
between the earthquake and seismic stations. 

Just as light waves reflect at a mirror, seismic waves reflect at
interfaces across which physical properties change, such as the
boundary between the earth’s mantle and core. Because the
amplitudes of the reflected and transmitted seismic waves depend
on the velocities and densities of the material on either
side of the boundary, analysis of seismic waves yields information
on the nature of the interface. In addition to refraction and
reflection, waves also undergo diffraction. Just as sound diffracts
around the corner of a building, allowing us to hear what
we cannot see, seismic waves bend around “obstacles” such as
the earth’s core.
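
For the simplest case, a wave striking an interface head-on (normal incidence), the dependence on velocity and density enters only through the product of the two, the acoustic impedance. The sketch below assumes normal incidence and one common sign convention for displacement amplitudes (other conventions flip the sign of the reflection coefficient); the function name and the input values are illustrative only.

```python
def normal_incidence_coefficients(rho1, v1, rho2, v2):
    """Displacement reflection and transmission coefficients at normal incidence.

    Medium 1 carries the incident wave; medium 2 lies beyond the interface.
    Assumes a welded interface between two uniform materials.
    """
    z1 = rho1 * v1  # impedance on the incident side
    z2 = rho2 * v2  # impedance on the far side
    reflection = (z1 - z2) / (z1 + z2)
    transmission = 2.0 * z1 / (z1 + z2)
    return reflection, transmission

# Made-up densities (g/cm^3) and velocities (km/s), for illustration only.
r, t = normal_incidence_coefficients(2.5, 5.0, 3.0, 7.0)
print(f"reflected fraction = {r:.3f}, transmitted fraction = {t:.3f}")
```

No reflection occurs when the impedances match, and the larger the impedance contrast, the larger the reflected amplitude, which is why reflection amplitudes constrain the nature of an interface.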

The basic data for these studies are seismograms, records of 
the motion of the ground resulting from the arrival of refracted, 
reflected, and diffracted seismic waves. Seismograms incorporate
precise timing, so that travel times can be determined.
The seismometer’s response is known, so the seismogram can 
be related to the actual ground motion. Because ground motion 
is a vector, three different components (north-south, east-west,
and up-down) are typically recorded. Hence, although
seismograms at first appear to be simply wiggly lines, they 
contain interesting and useful information. 

To illustrate the use of seismology for the study of earth
structure, consider a seismogram from a magnitude 6 earthquake
in Colombia, recorded about 4900 kilometers away in
Colorado (Fig. 1.1-3). Several seismic wave arrivals, called
phases, are identified using a simple nomenclature that describes
the path each followed from the source to the receiver.
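
Distances between earthquake and station are often quoted as an angle subtended at the earth's center, as in the caption of Fig. 1.1-3. A short sketch of the conversion to distance along the surface, assuming a spherical earth with the 6371 km radius quoted earlier (the function name is ours):

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius quoted in the text

def degrees_to_km(delta_deg, radius_km=EARTH_RADIUS_KM):
    """Arc length along the surface of a spherical earth for an angular distance."""
    return math.radians(delta_deg) * radius_km

print(round(degrees_to_km(44.0)))  # 4893, i.e. about 4900 km
```

One degree of arc thus corresponds to roughly 111 km at the surface.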



Fig. 1.1-3 Left: Long-period vertical component seismogram at Golden, Colorado, from an earthquake in Colombia (July 29, 1967), showing various
seismic phases. The distance from earthquake to station is 44°. Right: Ray paths for the seismic phases labeled on the seismogram.
Fig. 1.1-4 Seismogram (left) and ray paths (right) for a deep focus earthquake


We will see that seismic waves are divided into two types. In
one type, P or compressional waves, material moves back and
forth in the direction in which the wave propagates. In the
other, S or shear waves, material moves at right angles to the
propagation direction. P waves travel faster than S waves, so
the first arriving pulse, labeled “P,” is a P wave that followed a
direct path from the earthquake to the seismometer.¹ Soon
afterwards, a pulse labeled pP appears, which went upward
from the earthquake, reflected off the earth’s surface, and
then traveled to the seismometer as a P wave. If the distribution
of seismic velocity near the source is known, the depth
of the earthquake below the earth’s surface can be found
from the time difference between the direct P and pP phases,
because the primary differences between their ray paths are the
pP segments that first go up to and then reflect off the surface.
The phase marked PP is a compressional wave that went downward
from the source, “bottomed,” reflected at the surface,
and repeated the process. Among the later arrivals on the
seismogram are shear wave phases, including the direct shear
wave arrival, S, and a shear phase SS that reflected off the
surface, analogous to PP. All these phases, which traveled
through the earth’s interior, are known as body waves. The
large amplitude wave train that arrives later, marked “Rayleigh,”
is an example of a different type of wave. Such surface
waves propagate along paths close to the earth’s surface.
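
The depth estimate from the pP and P arrivals can be sketched with a deliberately crude model. We assume a uniform P velocity above the source, a flat free surface, and a distant station, so that the two phases leave the source region as parallel rays; under those assumptions the delay is 2 h cos(i) / v for focal depth h and angle i from the vertical. The function name and numbers below are hypothetical.

```python
import math

def depth_from_pP_delay(delay_s, v_km_s, takeoff_deg=0.0):
    """Estimate focal depth (km) from the pP - P delay time.

    Assumes uniform velocity v above the source and a ray leaving the
    source at takeoff_deg from the vertical, so delay = 2 h cos(i) / v.
    """
    return delay_s * v_km_s / (2.0 * math.cos(math.radians(takeoff_deg)))

# Illustrative values only: a 20 s delay and 6.5 km/s above the source.
print(depth_from_pP_delay(20.0, 6.5))  # near-vertical rays
print(depth_from_pP_delay(20.0, 6.5, takeoff_deg=30.0))
```

In practice the velocity varies with depth, so the delay is computed through a layered or continuous model rather than a single value.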

Figure 1.1-4 shows a seismogram from an earthquake at
a depth of 650 km in the Tonga subduction zone recorded in
Hawaii. The seismometer is oriented such that all the arrivals
are shear waves. In addition to S and SS, phases reflected at
the core-mantle boundary appear. ScS went down from the
source, reflected at the core-mantle boundary (hence “c”), and
came back up to the seismometer. Its travel time gives the depth
to the core if the velocity in the mantle is known. Alternatively,
if the depth to the core is known, the travel time gives a vertical
average of velocity with depth in the mantle. In addition, the
large amplitude of these reflections constrains the contrast in
physical properties between the solid rock-like lower mantle
and the fluid iron outer core. Multiple reflections also occur:
ScSScS, or ScS₂, reflects twice at the core-mantle boundary,
ScS₃ reflects three times, and ScS₄ four times. Similar to the
phase SS, the S₃ wave reflects twice off the surface, and S₄
reflects three times. By analogy to pP, sScS went upward
from the source and was reflected first at the surface and then
at the core-mantle boundary. Most of the multiple SS and
ScS phases also have observable surface reflected phases
(e.g., sScS₂, sScS₃, etc.).

1 The labels P and S come from the early days of seismology, when P stood for
primary and S stood for secondary.

Fig. 1.1-4 (continued) in Tonga, recorded at Oahu (Hawaii), showing multiple core reflections.
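
The two uses of the ScS travel time are the same relation read in opposite directions. A minimal sketch, assuming a vertically traveling ScS with source and receiver both at the surface, so that the two-way time equals twice the depth divided by the average shear velocity; the function names and the round numbers are ours and are not read from Fig. 1.1-4.

```python
def core_depth_from_ScS(two_way_time_s, mean_vs_km_s):
    """Depth to the core-mantle boundary from a vertical ScS time and a known mean velocity."""
    return mean_vs_km_s * two_way_time_s / 2.0

def mean_velocity_from_ScS(two_way_time_s, core_depth_km):
    """Vertically averaged mantle shear velocity from a vertical ScS time and a known depth."""
    return 2.0 * core_depth_km / two_way_time_s

# Round illustrative values: a 935 s two-way time and a 2890 km deep boundary.
vs = mean_velocity_from_ScS(935.0, 2890.0)
print(f"mean shear velocity = {vs:.2f} km/s")
print(f"depth recovered = {core_depth_from_ScS(935.0, vs):.0f} km")
```

A single travel time cannot give both quantities at once; one must be known from other data, a point that recurs in the discussion of inverse problems.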

These examples indicate some of the ways in which seismological
observations are used to study earth structure. By collecting
many such records, seismologists have compiled travel
time and amplitude data for many seismic phases. Because the
different phases have different paths, they provide multiple
types of information about the distribution of seismic velocities,
and therefore physical properties within the earth. Seismology
can also be used to study the internal structure of other
planets; seismometers were deployed on the lunar surface by
each of the Apollo missions, and the Viking spacecraft that
landed on Mars carried a seismometer.

An important use of seismology is the exploration of near-surface
regions for scientific purposes or resource extraction.
Figure 1.1-5 shows a schematic version of a common technique
used. An artificial source at or near the surface generates
seismic waves that travel downward, reflect off interfaces at
depth, and are detected by seismometer arrays. The resulting
data are processed using computers to enhance the arrivals corresponding
to reflections and to estimate the velocity structure.
Seismograms from different receivers are then displayed side
by side, with the travel time increasing downward, to yield an
image of the vertical structure. Reflections that match between
seismograms give near-horizontal arrivals that often correspond
to interfaces at depth. The vertical axis can be converted
from time to depth using the estimated velocities, and reflectors
can be identified using geological information from the surface
and drill holes (Fig. 1.1-6). Such seismic images of the subsurface
provide a powerful tool for structural and stratigraphic
studies. Although applications of seismology to exploration
have traditionally been treated in universities as distinct from
those dealing with earthquakes and the large-scale structure of
the earth, this distinction is largely historical.² These applications
draw on a common body of seismological principles, and
the techniques used have considerable overlap.

Fig. 1.1-5 Schematic example of the seismic reflection method, the basic tool of
hydrocarbon exploration.

Fig. 1.1-6 Data from a reflection seismic survey across the San Juan Basin, New
Mexico (bottom) and the resulting geological interpretation (top). (Sangree
and Widmier, 1979. Reprinted by permission of the Society of Exploration
Geophysicists.)
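
The conversion of the vertical axis from time to depth can be sketched as follows. We assume vertical travel paths and a velocity model made of layers, each described by the two-way time spent in it and its interval velocity; every name and number here is hypothetical.

```python
def time_to_depth(twt_s, layers):
    """Convert a two-way reflection time (s) to depth (km).

    layers: list of (two_way_time_thickness_s, velocity_km_s) pairs,
    ordered from the surface downward. Assumes vertical ray paths.
    """
    depth = 0.0
    remaining = twt_s
    for dt, v in layers:
        step = min(dt, remaining)
        depth += v * step / 2.0  # halve: the wave goes down and back up
        remaining -= step
        if remaining <= 0.0:
            break
    if remaining > 0.0:
        raise ValueError("time lies below the base of the velocity model")
    return depth

# Made-up model: velocity increases with depth.
model = [(0.5, 2.0), (0.5, 3.0), (1.0, 4.0)]
for t in (0.5, 1.0, 1.5):
    print(f"{t:.1f} s two-way time -> {time_to_depth(t, model):.2f} km")
```

Because velocity usually increases with depth, equal intervals of time on a seismic section correspond to progressively thicker intervals of rock.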


2 This book follows this tradition and focuses on earthquakes and large-scale earth 
structure because of the existence of an excellent introductory literature dealing with 
exploration seismology and the inflexibility of university curricula. 


Seismic sources, typically earthquakes, are also a major
topic of seismological study. The location of an earthquake,
known as the focus or hypocenter, is found from the arrival
times of seismic waves recorded on seismometers at different
sites. This location is often shown by the epicenter, the point
on the earth’s surface above the earthquake. The size of earthquakes
is measured from the amplitude of the motion recorded
on seismograms, and given in terms of magnitude or moment.³
In addition, the geometry of the fault on which an earthquake
occurred is inferred from the three-dimensional pattern of radiated
seismic waves. Figure 1.1-7 illustrates the method used for
an earthquake in which the material on one side of a vertically
dipping fault moves horizontally with respect to that on the
other side. This motion generates seismic waves that propagate
away in all directions. In some directions the ground first
moves away from the source (toward a seismic station),
whereas in other directions the ground first moves toward the
source (away from a receiver). The seismograms thus differ
between stations. In the “toward” (called compressional)
quadrants the first ground motion recorded is toward the receiver,
whereas in the “away” (called dilatational) quadrants
the first ground motion is away from the receiver. Because the
seismic waves go down from the source, turn, and arrive at a
distant seismographic station from below, the first motion
is upward in a compressional quadrant and downward in a
dilatational quadrant.⁴ The compressional and dilatational
quadrants can be identified using seismograms recorded at
different azimuths around the source. The fault orientation and
a surface perpendicular to it can then be found, because in
these directions the first motion changes polarity. With the use
of additional data we can often tell which of these surfaces
was the actual fault. Given the fault orientation, the direction
of motion can also be found; note that the compressional and
dilatational quadrants would be interchanged if the fault had
moved in the opposite direction. The pulse radiated from the
earthquake also gives some information about the amount of
slip that occurred, the size of the area that slipped, and the
slip process.

3 Magnitude is given as a dimensionless number measured in various ways, including
the body wave magnitude m_b, surface wave magnitude M_s, and moment magnitude
M_w, as discussed in Section 4.6. The seismic moment has the dimensions of
energy, dyn-cm or N-m.

Fig. 1.1-7 First motions of seismic P waves observed
at seismometers located in various directions about
the earthquake allow the fault orientation to be
determined.
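
The four-quadrant pattern of first motions can be sketched for the vertical strike-slip fault discussed in the text. We assume the standard result that the P-wave amplitude around such a fault varies as sin(2φ), with φ the azimuth measured clockwise from the fault strike; the sign convention, the handling of slip sense, and the function name are our choices for illustration.

```python
import math

def first_motion(azimuth_from_strike_deg, right_lateral=True, tol=1e-9):
    """Polarity of the first P motion at a distant station.

    azimuth_from_strike_deg: station azimuth, clockwise from the fault strike.
    Assumes a vertical strike-slip fault with a sin(2*phi) P radiation
    pattern; right-lateral slip puts dilatation in the first quadrant.
    """
    amp = math.sin(2.0 * math.radians(azimuth_from_strike_deg))
    if right_lateral:
        amp = -amp
    if abs(amp) < tol:
        return "nodal"  # along the fault or the surface perpendicular to it
    return "up (compressional)" if amp > 0 else "down (dilatational)"

for az in (0, 45, 90, 135, 225, 315):
    print(f"{az:3d} deg: {first_motion(az)}")
```

Calling the function with right_lateral=False interchanges the compressional and dilatational quadrants, and the nodal directions at 0° and 90° mark the fault and the surface perpendicular to it, which first motions alone cannot tell apart.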

Such observations of the location of earthquakes and the
fault motion that occurred in them are among the most important
data we have for understanding plate tectonics, the primary
process shaping our planet. The earthquake analyzed in
Fig. 1.1-7, for example, is like those that occur along the San
Andreas fault in northern California, part of the boundary
along which the Pacific plate moves northward with respect
to the North American plate. The fault is visible at the earth’s
surface, so geological and geodetic observations also show the
motion that occurs in earthquakes. In less accessible areas
seismological observations provide most of the data used to
identify the boundary along which motion occurs and to demonstrate
its nature. This is the case for most plate boundaries,
which occur in the oceans, beneath several kilometers of water.
Similarly, in subduction zones, where lithospheric plates
descend deep into the mantle and earthquakes can occur to
depths of 660 km, direct observations are not possible, but
analyses of seismograms reveal the motions and give insight
into their tectonic causes.

4 These terms are not the same as compressional and shear waves; as often occurs in
science, words have multiple meanings.


1.1.2 Models in seismology 

As summarized in the previous section, seismology provides a
great deal of information about seismic sources, the structure
of the earth, and the relation of earthquakes to the tectonic processes
that produce them. Even so, we will see that there are
major limitations on what the present seismological observations
and other data tell us. For example, although we have
good models of seismic velocity in the earth, we know much
less about the composition of the earth and have only general
ideas about the deep physical processes, such as convection,
thought to be taking place. Similarly, although seismological
observations provide a great deal of detail about the slip that
occurs during an earthquake, we still have only general ideas about how
earthquakes are related to tectonics, little understanding of the
actual faulting process, no ability to predict earthquakes on
time scales shorter than a hundred years, and only rudimentary
methods to estimate earthquake hazards. This situation is
typical of the earth sciences,⁵ largely because of the complexity
of the processes being studied and the limits of our observations.
Our best response seems to be to show humility in the face of
the complexity of nature, recognize what we presently know
and what we do not, use statistical techniques to assess what
we can say with differing degrees of confidence from the data,
and develop new data and techniques to do better.

5 In discussing analogous issues Sarewitz and Pielke (2000) note that even after billions
of dollars spent on climate research, a senior scientist observes, “This may come
as a shock to many people who assume that we do know adequately what’s going
on with the climate, but we don’t,” and the National Academy of Sciences states that
deficiencies in our understanding “place serious limitations on the confidence” of
climate modeling results.

In general, the approach taken is to describe complex problems
with simplified models that seek to represent key elements
of the process under consideration. For example, an earthquake
is a complicated rupture process that occurs in a finite
volume and radiates seismic energy through the real materials
of the earth. As we will see in the next few chapters, we represent
all aspects of this process with simple models. We treat
the complex faulting process as elastic slip on an infinitely
narrow surface. We further treat the rock around it as a simple
elastic material, and thus describe the complex seismic wave
disturbance that propagates through it, using a number of
simplifications.

It is important to bear in mind that these models are only
approximations to a more complicated reality. For example,
although the radiated seismic energy is real (it can destroy 
buildings), the mathematical descriptions used to understand it 
are human constructs. P waves, S waves, seismic phases like 
ScS, seismic ray paths, surface waves, or the earth’s normal
modes are all approximations that make the radiated energy 
easier to conceptualize. Similarly, we model a fault as a planar 
slip surface and use seismological observations to characterize 
the slip geometry and history. However, although this process 
nicely replicates the seismic observations, it only approximates 
the actual physics of earthquake rupture. 

We often use a hierarchy of different approximations, as 
appropriate. For example, we might first predict the approximate
time when a packet of seismic energy arrives by treating
it as a seismic ray, and then use a more sophisticated wave or 
normal mode calculation to predict its amplitude and hence 
learn more about the properties of the parts of the earth it 
traversed. Similarly, we first describe the earth as isotropic 
(having the same properties in all directions) and purely elastic 
(no seismic energy is lost to heat by friction) and then confront 
the deviations from these simplifications. 

A similar approach is often followed when discussing the 
tectonic context of earthquakes. Although faults, earthquakes, 
volcanoes, and topography are real, we associate these with the 
boundaries of plates that are human approximations. We will 
see that the questions of when to regard a region as a plate and 
how to characterize its boundaries are not simple. The simplest 
analyses assume that plates are rigid and divided by narrow 
boundaries. Later, we treat the boundaries as broad zones, and 
eventually we confront the fact that plates are not perfectly 
rigid, but in fact deform internally, as shown by earthquakes 
that occur within them. 

We often choose a type of model to represent the earth 
and then use seismological and other data to estimate the 
parameters of this model. Thus a characteristic activity of 
seismology, and of the earth sciences in general, is solving 
inverse problems. We start with the end result, the seismograms,
and work backwards using mathematical techniques to
characterize the earthquakes that generated the seismic waves
and the material the waves passed through. Inverse problems
are more complicated than the conceptually simpler forward
problems in which we use the theory of seismic wave generation
and propagation to predict the seismogram that would be
observed for a given source and medium. Inverse problems are 
harder to solve for several reasons. Seismograms reflect the 
combined effect of the source and medium, neither of which is 
known exactly. There are often aspects of the inverse problem 
that the data are insufficient to resolve. Thus seismology and 
other branches of the earth sciences, to a greater extent than 
most other scientific disciplines, often infer a “big picture” from 
grossly limited and insufficient data. For example, our images 
of the earth from seismic waves suffer from the fact that the 
severely limited geographical distributions of both earthquakes 
and seismometers leave most of earth’s interior unsampled. This 
situation is like a doctor examining a possible broken bone with 
only a few scattered bursts of x-rays from random directions. 

Moreover, although the forward problem typically can be 
solved in a straightforward way, giving a unique solution, 
the inverse problem often has no unique solution. In fact, the 
data are generally somewhat inconsistent due to errors, so no 
model can exactly describe the data. Finally, the fact that solving
the inverse problem yields a set of model parameters that
describe the observations well does not necessarily mean that
the resulting model actually reflects physical reality. This
nonuniqueness reflects the logical tenet that because a implies b,
b does not necessarily imply a. In fact, we often have no way of
determining what the reality is. For example, we will never 
truly know the composition and temperature of the earth’s core 
because we cannot go there. This limitation remains in spite 
of the fact that over time our models of the core have become
increasingly consistent with seismological data, experimental
results about materials at high pressure and temperature, and
other data including inferences from meteorites about the
composition of the solar system. 6
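
The nonuniqueness described above can be illustrated with a deliberately simple sketch. It assumes a vertically traveling wave crossing a stack of uniform layers, so the forward problem is just a sum of thickness divided by velocity. The two layered models and their values are hypothetical, chosen only to show that different models can predict the same observation.

```python
def travel_time(layers):
    """Forward problem: vertical travel time (s) through a stack of layers.

    `layers` is a list of (thickness_km, velocity_km_per_s) pairs.
    """
    return sum(thickness / velocity for thickness, velocity in layers)


# Two hypothetical earth models with different layering (illustrative values).
model_a = [(10.0, 5.0), (30.0, 7.5)]  # 2 s in the upper layer + 4 s in the lower
model_b = [(20.0, 5.0), (16.0, 8.0)]  # 4 s in the upper layer + 2 s in the lower

t_a = travel_time(model_a)
t_b = travel_time(model_b)

# Both models predict the same single observation, so that observation
# alone cannot tell us which model (if either) describes the earth.
print(f"model A: {t_a:.1f} s, model B: {t_b:.1f} s")
```

The forward calculation is unique for each model, but a single travel time constrains only the sum of the layer contributions, which is why additional and different kinds of data are needed to narrow the range of acceptable models.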

6 Similar difficulties afflict most of the earth sciences. Field geologists will never
know whether their inferences about the past history and environment of a region
are correct; paleontologists will never know how realistic their models of ancient
life are, etc.

A consequence of this approach is the need to consider issues
of precision, accuracy, and uncertainty. Estimates of quantities
like the magnitude or depth of an earthquake depend both on
the precision, or repeatability, with which data like seismic
wave arrival times and amplitudes are measured, and on the
accuracy, or extent to which the resulting inferences correctly
describe the earth. For example, earthquake magnitudes are
simple measures of earthquake size, estimated in various ways
from seismograms without accounting for effects like the geometry
of the earthquake source or lateral variations in seismic
velocities. Hence measurements at different sites yield various
estimates, so it is of little value to argue whether an earthquake
had magnitude 5.2 or 5.4. Similarly, focal depths are derived
from seismic wave arrival times by assuming a velocity structure
near the earthquake, which is often not well known. For
example, the depth is sometimes estimated (Section 4.3.3) from
half the product of the time difference between the direct P and
pP phases (see Fig. 1.1-3) and the velocity. If the time difference
is measured to 0.25 s, and the velocity is 8 km/s, the method
of propagation of errors (Section 6.5.1) shows that the uncertainty
in depth is about 1 km, so it makes little sense to report
the depth to greater precision. In reality the uncertainty will
be greater, because the velocity also has some uncertainty. It is
important to bear in mind that assigning a single value to an
earthquake depth may exceed the relevant accuracy because
faulting extends over a finite area that may be large (on the
order of 10 km for a magnitude 6 earthquake). Moreover,
when we have alternative models with which to estimate
a parameter (for example, the earthquake stress drop estimated
from body waves depends on the assumed geometry of
the fault), the uncertainty associated with an estimate using
any particular model underestimates the uncertainty due to
the fact that we do not know which model is best. It is thus
useful to examine how the estimate depends on the precision
of the observation, the model parameters, and the choice of
models.
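
The depth example above can be checked with a short sketch. It assumes the usual first-order propagation of independent errors, which may differ in detail from the treatment in Section 6.5.1. The function names, the 10 s time difference, and the 0.2 km/s velocity uncertainty are hypothetical values chosen for illustration.

```python
import math


def depth_from_pP(dt_s, v_km_s):
    """Depth (km) as half the product of the pP-P time difference and the velocity."""
    return 0.5 * dt_s * v_km_s


def depth_uncertainty(dt_s, v_km_s, sigma_dt_s, sigma_v_km_s=0.0):
    """First-order uncertainty in depth (km), assuming independent errors.

    For h = v * dt / 2:
        sigma_h^2 = (v / 2 * sigma_dt)^2 + (dt / 2 * sigma_v)^2
    """
    term_dt = 0.5 * v_km_s * sigma_dt_s
    term_v = 0.5 * dt_s * sigma_v_km_s
    return math.sqrt(term_dt**2 + term_v**2)


# Values from the text: timing uncertainty 0.25 s, velocity 8 km/s.
# The 10 s time difference is a hypothetical choice; it does not affect
# the result when the velocity is treated as exact.
sigma_timing_only = depth_uncertainty(10.0, 8.0, 0.25)

# Adding a hypothetical 0.2 km/s velocity uncertainty increases the total.
sigma_with_velocity = depth_uncertainty(10.0, 8.0, 0.25, 0.2)

print(f"depth: {depth_from_pP(10.0, 8.0):.0f} km")
print(f"timing only: {sigma_timing_only:.2f} km")
print(f"timing and velocity: {sigma_with_velocity:.2f} km")
```

With the velocity treated as exact, the timing term alone gives 1 km, matching the text. Any velocity uncertainty adds in quadrature and can only increase the result.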

Seismologists generally assume that the best estimates of
values and uncertainties come from studies by different investigators
using multiple datasets and techniques. Ideally, studies
using the same data increase precision by reducing random
errors, and studies using different data and techniques increase
accuracy by reducing the effect of systematic errors. For example,
for the well-studied Loma Prieta earthquake, seismic
moment estimates vary by about 25%, and M_s values vary by
about 0.1 units.

However, statisticians have long noted the difficulties in assessing
probabilities and uncertainties. Two famous examples
are the Titanic, described as “unsinkable” (probability zero),
and the space shuttle, which was lost on its twenty-fifth launch,
surprisingly soon given the estimated probability of accident of
1/100,000. Other examples come from the history of measurements
of physical constants, which shows that the reported
uncertainties underestimate the actual errors. For example, the
27 successive measurements of the speed of light between 1875
and 1958 are shown by subsequent analysis to be consistently
in error by much more than the assigned uncertainty. It appears
that assessments of the formal or random uncertainty often
significantly underestimate the systematic error, so the overall
uncertainty is dominated by the unrecognized systematic error
and thus larger than expected. As a result, measurements of
a quantity often remain stable for some time, and then change
by much more than the previously assumed uncertainty. One
possible explanation, termed the “bandwagon effect,” is the
tendency to discount data that are inconsistent with previous
ideas, but later prove more accurate than those included.
Another effect appears to be the discarding of outliers: for
example, although R. Millikan reported using all the observations
in his Nobel prize-winning (1910) study of the charge of
the electron, his notebooks show that he discarded 49 of 107
oil drops that appeared discordant, increasing the apparent
precision of the result. Until a method is developed that
excludes obviously erroneous data without discarding real
disconforming evidence, making realistic uncertainty estimates
will remain a challenge. Although such analyses are more
difficult in the earth sciences (for example, an earthquake is a
nonrepeatable experiment), they are useful to bear in mind.
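
To see why a loss on the twenty-fifth launch is so surprising under the quoted estimate, the following sketch computes the chance of at least one accident in a given number of launches. It assumes that launches are independent and share the same per-launch probability, which is a simplification introduced here and not a claim about how the original estimate was derived.

```python
def prob_at_least_one_failure(p_per_launch, n_launches):
    """Probability of one or more failures in n independent launches."""
    return 1.0 - (1.0 - p_per_launch) ** n_launches


# Per-launch estimate quoted in the text: 1 in 100,000.
p_loss_within_25 = prob_at_least_one_failure(1.0 / 100_000, 25)

print(f"chance of a loss within 25 launches: {p_loss_within_25:.5f}")
```

Under these assumptions the chance is roughly 1 in 4000, so the observed loss suggests that the assumed per-launch probability was far too optimistic.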

This discussion brings out the fact that although we often 
speak of “finding” or “determining” quantities like earthquake
source parameters or velocity structure, it might be
better to speak of “estimating” or “inferring” these quantities. 
There is no harm in the common and more upbeat phrasing 
so long as we remember that these values reflect uncertainties 
due to random noise and errors of measurement (sometimes 
called aleatory uncertainty, after the Latin word for dice) 
and systematic (sometimes called epistemic) uncertainty due
to our choice of model to describe the phenomenon under 
consideration. 

Although these caveats sound worrisome, seismological 
models are far from useless. We can usually develop models
that not only describe the data used to develop them, but also
predict other data. For example, earthquake source models derived
only from seismology often predict the observations
made using field geology and geodesy (ground deformation), 
both for the specific earthquake studied and for others in the 
same region. Moreover, the seismological results often give 
useful insight that is consistent with other lines of evidence. For 
example, seismology, gravity, and geomagnetism all favor the 
earth having a dense liquid iron core chemically different from 
the rocky mantle. This idea is also consistent with the fact 
that meteorites — thought to be fragments of small planets — 
are divided into stony and iron classes. Hence seismologists 
use this modeling approach to understand the earth, while 
recognizing its limitations. 

For several reasons, our models usually improve with time. 
First, the data improve in both quantity and quality. Second, 
new observational and analytical techniques are introduced. 
As a result, long-standing problems such as the velocity structure
of the earth are repeatedly reassessed. Successive generations
of models seek to explain additional types of data, and
often contain more model parameters in the hope of better representing
the earth. Using statistical tests, we find that in some
cases the resulting improvements are significant, whereas in 
others the new model improves only slightly on earlier ones. An 
important point is that more complicated models can always fit 
data better, because they contain more free parameters, just as 
a set of points in the x-y plane can be better fit by a quadratic
polynomial than by a straight line. Thus we can statistically test 
models to see whether a new model reduces the misfit to the 
data more than would be expected purely by chance due to the 
additional parameters. Another useful test is whether the new 
or old models do a better job of describing data that were not 
used in deriving either, a process called pure prediction. When 
new models pass these tests, we can accept them — and then 
look again to see which data are still not described well and try 
to do better. 
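
One common way to carry out such a test is an F-test for nested models, sketched below. This is an assumed choice of test for illustration and not necessarily the one used later in the book. The function name and the misfit values are hypothetical, and "misfit" here means a sum of squared residuals.

```python
def f_statistic(misfit_simple, p_simple, misfit_complex, p_complex, n_data):
    """F statistic comparing a simple model with a more complex nested one.

    misfit_*  : sum of squared residuals for each model
    p_*       : number of free parameters in each model
    n_data    : number of data
    """
    if p_complex <= p_simple:
        raise ValueError("the complex model must have more parameters")
    if n_data <= p_complex:
        raise ValueError("need more data than parameters")
    # Misfit reduction per added parameter ...
    numerator = (misfit_simple - misfit_complex) / (p_complex - p_simple)
    # ... relative to the remaining misfit per degree of freedom.
    denominator = misfit_complex / (n_data - p_complex)
    return numerator / denominator


# Hypothetical example: 20 points fit by a line (2 parameters, misfit 30)
# and by a quadratic (3 parameters, misfit 28).
f_value = f_statistic(30.0, 2, 28.0, 3, 20)
print(f"F = {f_value:.2f}")
```

The resulting value would be compared with a tabulated F distribution for 1 and 17 degrees of freedom. A small F means the extra parameter reduced the misfit by no more than would be expected by chance.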



Fig. 1.1-8 Schematic illustration of how models of earth processes 
advance with time due to additional data and improved model 
parameterizations. 

Over the years this process leads to a better understanding 
of how the earth works (Fig. 1.1-8). For example, Fig. 1.1-9 
summarizes the development of global plate motion models, 
discussed in Chapter 5, that give the motion of the dozen or so 
major plates. The models are derived by inverting data consisting
of the directions of plate motions along transform faults,
the directions of plate motions during earthquakes, and the
rates of plate motions shown by sea floor magnetic anomalies.
Since 1972, when the first such model was made, the amount of
available data has increased, and the data have become better, 
due to advances in seismology, sea floor imaging, and marine 
magnetic measurements. Similarly, the fit to the data has 
improved (or the misfit reduced) due both to the higher data 
quality and to improvements in the model, such as treating 
India and Australia as separate plates. Similar patterns of 
increased data and improved fit occur for many applications, 
including seismic velocity structure in the earth. 

Many of the same issues surface when considering the 
models used to describe earth processes. For example, we will 
see that there are various models for what occurs at the core-mantle
boundary or what causes earthquakes within downgoing
plates at subduction zones. Such models assume that a
particular set of physical processes occurs, and show that for
apparently plausible values of the (often unknown) relevant 
physical parameters, some behavior like that observed might 
be expected. Although these simple models attempt to reflect 
key aspects of the complex natural system, we often have no 
way of telling if and how well they succeed. Typically, various 
plausible models are suggested, all of which may in part be true 
and offer interesting insights into what may be occurring. The 
data often do not allow discrimination between them, so the 
model one prefers depends on one's geological instincts and 
prejudices, and models go in and out of vogue. A common 
scenario is for a model to become the consensus of the small 
group of researchers most interested in a problem, and then be 
challenged by fresh ideas or data from the outside. Hence, critically
examining conventional wisdom often leads to discarding
or modifying it, and so making progress in keeping with the
ancient Jewish sages’ observation that “the rivalry of scholars
increases wisdom.” 7 This process requires a constant cycle of
learning and unlearning in which old models are discarded,
even by those who helped create them, in favor of new models.

Fig. 1.1-9 Evolution of successive global plate motion models, as the
amount of data increases and the misfit is reduced. Left: Number of data
used to derive the models. Three types of data are inverted: earthquake
slip vector azimuths, transform fault azimuths, and spreading rates.
Right: The misfit to NUVEL-1 data for the various models. The vertical
bars showing total misfit are separated into segments giving the misfit to
each type of data. (DeMets et al., 1990. Geophys. J. Int., 101, 425-78.)

Fig. 1.1-10 Tectonic cartoon for oceanic and continental margin trenches,
prior to the acceptance of plate tectonics. The association of dip-slip
earthquakes with trenches, volcanism, and mountain ranges was
recognized. Note the exaggeration of surface relief. (Benioff, 1955. From
Crust of the Earth, ed. A. Poldervaart. Reproduced with permission of the
publisher, the Geological Society of America, Boulder, CO. Copyright ©
1955 Geological Society of America.)

The classic geological example of advancing beyond conventional
thinking is the plate tectonic revolution of the late 1960s.
Although the idea of continental drift had been around for a 
long time and was strongly advocated by Alfred Wegener in 
1915, it was not accepted by most of the geological community 
in the USA and Europe, 8 in part because seismological pioneer 
Harold Jeffreys argued that it was impossible. As a result, 
although it was recognized in the 1950s that earthquakes 
occurred on mid-ocean ridges that were young volcanic features
and at deep sea trenches in association with volcanoes
and mountain ranges (Fig. 1.1-10), their underlying nature was 
not understood. However, once paleomagnetic and marine 
geophysical data led to the recognition that oceanic lithosphere 
formed at mid-ocean ridges and subducted at trenches, the 
seismological observations made sense. 

Thus, as in other sciences, progress in understanding seismological
problems is typically incremental during “normal
science” periods, in which we make small steady advances.
Occasionally, however, exciting “paradigm shifts” occur when
important new ideas change our views from our previous conventional
thinking and permit great advances. This concept,
developed by philosopher of science Thomas Kuhn (1962) for
science-wide conceptual revolutions like the theory of plate
tectonics, also describes progress in subfields. It is particularly
apt in seismology, because many major faults move at most
slightly for many years, and then break dramatically in large
earthquakes.

7 Alternative formulations of this idea include David Jackson’s observation
(Fischman, 1992), “as soon as I hear ‘everybody knows’ I start asking ‘does everybody
know this, and how do they know it?’”; the quotation used as the epigraph to
this book by Nobel Laureate Peter Medawar; and the adage attributed to 1960s
political activist Abbie Hoffman that “sacred cows make the best hamburger.”

8 Interestingly, many geologists in Southern Hemisphere countries like Australia
and South Africa accepted continental drift early on and never abandoned it.

1.2 Seismology and society 

Seismology impacts society through applications including 
seismic exploration for resources, earthquake studies, and 
nuclear arms control. These topics involve both scientific and 
public policy issues beyond our focus on using seismic waves to 
study earth structure, earthquakes, and plate tectonics. However,
given the natural interest of these societal applications,
we briefly discuss some issues in earthquake hazard analysis 
and nuclear test monitoring, in part to motivate our discussions 
of the basic science. 

These topics have the interesting feature that the state of
seismological knowledge influences policy, so scientific uncertainties
have broad implications. The choice of earthquake preparedness
strategies depends in part on how well earthquake
hazards can be assessed, and nations’ willingness to negotiate
test ban treaties depends in part on their confidence that compliance
can be verified seismologically. Seismology thus faces
the challenge, familiar in other applications like global warming
or biotechnology, of explaining both knowledge and its
limits. Failure to do so can have embarrassing consequences.
For example, since the 1960s the Japanese government has
spent more than $1 billion on an earthquake prediction program
premised on the idea that large earthquakes will be
preceded by observable precursory phenomena, despite the
fact that (as discussed shortly) many seismologists increasingly
doubt that such phenomena exist. This approach has so far
failed to predict destructive earthquakes, like that which struck
the Kobe area in 1995, and has focused most of its efforts on
areas other than those where these earthquakes occurred.
Critics have thus argued that the program is scientifically weak,
diverts resources that could be more usefully employed for
basic seismology and earthquake engineering, and gives the
public the misleading impression that earthquakes can currently
be predicted. Based on the program’s record to date, the
government would have been wiser to listen to these critics and
to have been more candid with the public. 1


1 Such issues were eloquently summarized by Richard Feynman’s (1988) admonition
after the loss of the space shuttle Challenger: “NASA owes it to the citizens from
whom it asks support to be frank, honest, and informative, so these citizens can 
make the wisest decisions for the use of their limited resources. For a successful 
technology, reality must take precedence over public relations, because nature cannot 
be fooled.” 






Fig. 1.2-1 Map showing epicenters of all earthquakes during 1963-95 with magnitudes of m_b > 4. Most earthquakes occur along the boundaries between
tectonic plates. Where these boundaries are distinct, the earthquakes occur within narrow bounds. More diffuse plate boundaries, like the Himalayan
plateau between India and China, show a much broader distribution of epicenters.

earthquakes and other phenomena. The
magnitude used is moment magnitude, M_w.
(After Incorporated Research Institutions
for Seismology.)




1.2.1 Seismic hazards and risks 

One of the primary motivations for studying earthquakes and 
seismology is the destruction caused by large earthquakes. In 
many parts of the world, seismic risks are significant, whether 
they are popularly recognized (as in Japan, where schools conduct
earthquake drills) or not. Much of the challenge in assessing
and addressing seismic hazards is that in any given area
large earthquakes are relatively rare on human time scales, but 
can cause great destruction when they occur. 

Earthquakes primarily occur at the boundaries where the 
100 km-thick tectonic plates converge, diverge, or slide past
each other. Although the plates move steadily, their boundaries
are often “locked,” and do not move most of the time. However,
on time scales of a few hundred years, the boundary slips
suddenly, and the accumulated motion is released in an earthquake.
Figure 1.2-1 shows the locations of m_b > 4 earthquakes
between 1963 and 1995. The earthquakes nicely define the 
plate boundaries, although some earthquakes also occur in 
intraplate regions, away from plate boundaries. 

The energy released by large earthquakes is striking (Fig. 1.2-2).
For example, the 1906 San Francisco earthquake involved
about 4 m of slip on a 450 km-long fault, releasing about
3 x 10^16 Joules 2 of elastic energy. This energy is equivalent to
a 7 megaton nuclear explosion, much larger than the 0.012
megaton bomb dropped on Hiroshima. The largest recorded
earthquake, the 1960 Chilean event in which about 21 m of
slip occurred on a fault 800 km long and 200 km across,
released about 10^19 J of elastic energy, more than a 2000 Mt
bomb. This earthquake released more energy than all the
nuclear bombs ever exploded, the largest of which was 58 Mt.
For comparison, the total global human annual energy consumption
is about 3 x 10^20 J.
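
The megaton comparisons above follow directly from the conversion given in this section's footnote on energy units (1 Mt is about 4.2 x 10^15 J). The sketch below simply applies that conversion; the constant and function names are illustrative.

```python
JOULES_PER_MEGATON = 4.2e15  # conversion quoted in the footnote on energy units


def joules_to_megatons(energy_joules):
    """Convert an energy in Joules to the equivalent yield in megatons of TNT."""
    return energy_joules / JOULES_PER_MEGATON


san_francisco_mt = joules_to_megatons(3e16)  # 1906 San Francisco earthquake
chile_mt = joules_to_megatons(1e19)          # 1960 Chilean earthquake

print(f"San Francisco 1906: about {san_francisco_mt:.0f} Mt")
print(f"Chile 1960: about {chile_mt:.0f} Mt")
```

The first value rounds to the 7 Mt quoted in the text, and the second exceeds 2000 Mt, consistent with the statement that the Chilean earthquake released more energy than a 2000 Mt bomb.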

Fortunately, the largest earthquakes are infrequent, because 
the energy released accumulates slowly over a long time. The 
San Francisco earthquake occurred on the San Andreas fault 
in northern California, part of the boundary along which the 
Pacific plate moves northward relative to the North American 
plate. Studies using the Global Positioning System satellites 
show that away from the plate boundary the two plates move 
by each other at a speed of about 45 mm/yr. Most parts of 
the San Andreas fault are “locked” most of the time, but slip 
several meters in a large earthquake every few hundred years. 
A simple calculation suggests that such earthquakes should occur
on average about every 4000 mm/(45 mm/yr) or 90 years.
The real interval is not uniform, for reasons that are unclear, 
and is longer, because some of the motion occurs on other 
faults. 
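
The simple recurrence estimate above, and the effect of some motion being taken up on other faults, can be sketched as follows. The `fraction_on_fault` parameter and its value of 0.75 are hypothetical and serve only to show the direction of the effect.

```python
def recurrence_interval_years(slip_per_event_mm, plate_rate_mm_per_yr,
                              fraction_on_fault=1.0):
    """Average time (yr) needed to accumulate the slip released in one earthquake.

    fraction_on_fault is the (hypothetical) share of the plate motion
    that is accommodated on this particular fault.
    """
    if not 0.0 < fraction_on_fault <= 1.0:
        raise ValueError("fraction_on_fault must be in (0, 1]")
    return slip_per_event_mm / (plate_rate_mm_per_yr * fraction_on_fault)


# Values from the text: 4 m of slip, 45 mm/yr of relative plate motion.
simple_estimate = recurrence_interval_years(4000.0, 45.0)

# If only three quarters of the motion occurred on this fault (illustrative).
partial_estimate = recurrence_interval_years(4000.0, 45.0, fraction_on_fault=0.75)

print(f"all motion on the fault: {simple_estimate:.0f} yr")
print(f"75% of motion on the fault: {partial_estimate:.0f} yr")
```

The first result is about 89 years, which the text rounds to 90. Reducing the share of motion on the fault lengthens the interval, as the text notes.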

2 The SI unit of energy is 1 Joule (J) = 1 Newton meter (N-m) = 10^7 ergs = 10^7 dyn-cm.
Nuclear explosions are often described in megatons (Mt), equivalent to
1,000,000 tons of TNT or 4.2 x 10^15 J.

Table 1.2-1 Numbers of earthquakes per year.

Earthquake           Number        Energy released
magnitude (M_s)      per year      (10^15 J/yr)

>8.0                 0-1           0-1,000
7-7.9                12            100
6-6.9                110           30
5-5.9                1,400         5
4-4.9                13,500        1
3-3.9                >100,000      0.2

Based upon data from the US Geological Survey National Earthquake
Information Center. Energy estimates are based upon an empirical
formula of Gutenberg and Richter (Gutenberg, 1959), and the magnitude
scaling relations of Geller (1976), and are very approximate.

Because plate boundaries extend for more than 150,000 km,
and some earthquakes occur in plate interiors, earthquakes
occur frequently somewhere on earth. As shown in Table 1.2-1,
an earthquake of magnitude 7 occurs approximately monthly,
and an earthquake of magnitude 6 or greater occurs on average
every three days. 3 Earthquakes of a given magnitude occur
about ten times less frequently than those one magnitude
smaller. Because the magnitude is proportional to the logarithm
of the energy released, most of the energy released seismically is
in the largest earthquakes. A magnitude 8.5 event releases more
energy than all the other earthquakes in a given year combined.
Hence the hazard from earthquakes is due primarily to large
(typically magnitude greater than 6.5) earthquakes.
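
The statements about how often earthquakes occur can be checked against the annual counts in Table 1.2-1. The sketch below uses only those counts. The rare events above magnitude 8 (0-1 per year) are left out, which is a simplification made here.

```python
# Annual counts copied from Table 1.2-1, ordered from larger to smaller magnitudes.
EVENTS_PER_YEAR = {
    "7-7.9": 12,
    "6-6.9": 110,
    "5-5.9": 1400,
    "4-4.9": 13500,
}


def mean_interval_days(events_per_year):
    """Average spacing in days between events occurring at the given annual rate."""
    return 365.25 / events_per_year


interval_m7 = mean_interval_days(EVENTS_PER_YEAR["7-7.9"])
interval_m6_plus = mean_interval_days(
    EVENTS_PER_YEAR["7-7.9"] + EVENTS_PER_YEAR["6-6.9"]
)

# Ratio of counts between each bin and the next larger-magnitude bin.
counts = list(EVENTS_PER_YEAR.values())
ratios = [count_small / count_big
          for count_big, count_small in zip(counts, counts[1:])]

print(f"magnitude 7-7.9: one every {interval_m7:.0f} days")
print(f"magnitude 6 and above: one every {interval_m6_plus:.1f} days")
print("count ratios per magnitude unit:", [round(r, 1) for r in ratios])
```

The intervals come out at about a month and about three days, and the ratios between adjacent magnitude bins are all close to ten, in line with the text.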

In assessing the potential danger posed by earthquakes or 
other natural disasters, it is useful to distinguish between hazards
and risks. The hazard is the intrinsic natural occurrence of
earthquakes and the resulting ground motion and other effects. 
The risk is the danger the hazard poses to life and property. 
Hence, although the hazard is an unavoidable geological fact, 
the risk is affected by human actions. Areas of high hazard can 
have low risk because few people live there, and areas of 
modest hazard can have high risk due to large populations and 
poor construction. Earthquake risks can be reduced by human 
actions, whereas hazards cannot (hence the US government’s 
National Earthquake Hazards Reduction Program is, strictly 
speaking, misnamed). 

These ideas are illustrated by Table 1.2-2, which lists some
significant earthquakes and their societal consequences. As
shown, some very large earthquakes caused no fatalities
because of their remote location or deep focal depth. In general,
the most destructive earthquakes occur where large populations
live near plate boundaries. The highest property losses
occur in developed nations where more property is at risk,
whereas fatalities are highest in developing nations. Although
the statistics are often imprecise, the impact of major earthquakes
can be enormous. Estimates are that the 1990 Northern
Iran shock killed 40,000 people, and that the 1988 Spitak
(Armenia) earthquake killed 25,000. Even in Japan, where
modern construction practices are used to reduce earthquake
damage, the 1995 Kobe earthquake caused more than 5000
deaths and $100 billion of damage. On average during the
past century earthquakes have caused about 11,500 deaths per
year. As a result, earthquakes have had a significant effect upon
the history and culture of many regions.

3 As part of his incorrect prediction of a magnitude 7 earthquake in the Midwest in
1990, I. Browning claimed that he had successfully predicted the 1989 Loma Prieta
earthquake. In fact, he had said that near the date in question there would be an
earthquake somewhere in the world with magnitude 6, a prediction virtually guaranteed to
be true.

Table 1.2-2 Some notable and destructive earthquakes. (Values in this table are compiled from various sources, and different estimates have been reported,
especially for older earthquakes.)

Location; date; strength
    Effects

Kourion, Cyprus; July 21, 365; X MMI
    Total destruction of this Greco-Roman city. Very large tsunami in the Mediterranean.

Basel, Switzerland; October 18, 1356; XI MMI
    Eighty castles destroyed over a wide area. 300 killed. Toppled cooking hearths caused fires that burned for
    many days.

Shansi, China; January 23, 1556; 8 M_s (est.)
    Collapse of cave dwellings carved into bluffs of soft glacial loess. 830,000 reported killed (worst ever). Near the
    1920 Kansu earthquake (see below).

Port Royal, Jamaica; June 7, 1692; 8 M_s (est.)
    Widespread liquefaction caused one-third of Port Royal to spread and sink 4 m beneath the ocean surface.
    2500 killed.

Lisbon, Portugal; November 1, 1755; >8 M_s (est.)
    Large tsunamis seen all around the Atlantic. Felt over 1,600,000 km^2. Algiers destroyed. 70,000 killed. Largest
    documented earthquake in Europe (though several Italian quakes have killed >150,000 in past 500 years).

New Madrid, MO; Dec. 1811 to Feb. 1812; 7-7.4 M_s (est.)
    Three large quakes (Dec. 16, 1811, Jan. 23, 1812, Feb. 7, 1812). Vertical movements up to 7 m. Widespread
    liquefaction. Changed course of Mississippi River. Felt over 5,000,000 km^2.

Charleston, SC; August 31, 1886; 7.2 M_s (est.)
    No previous seismicity observed in this area between 1680 and 1886. Felt over 5,000,000 km^2. 14,000 chimneys
    damaged or destroyed. 90% of buildings damaged/destroyed. 60 killed.

Sanriku, Japan; June 15, 1896; 8.5 M_s (est.)
    Tsunamis 35 m high washed away 10,000 houses and killed 26,000 along the Sanriku coast of Honshu. A similar
    Sanriku quake on March 2, 1933, killed 3000 with a 25 m high tsunami.

Assam, India; June 12, 1897; 8.7 M_s (est.)
    One of the largest quakes ever felt. 1500 killed. Extremely violent ground shaking. Other Himalayan events on
    April 4, 1905 (20,000 killed), January 15, 1934 (10,000 killed), and August 15, 1950 (M_s = 8.6, 1526 killed).

San Francisco, CA; April 18, 1906; 7.8 M_s
    About 4 m of slip on a 450 km-long fault. 28,000 buildings destroyed, largely by fires that burned for 3 days.
    2500-3000 killed by fires (worst in USA).

Kansu, China; December 16, 1920; 8.5 M_s
    180,000 killed, largely by downslope flow of liquefied soil over more than 1.5 km.

Tokyo, Japan; September 1, 1923; 8.2 M_s
    Occurred in Sagami Bay, 80 km south of Tokyo. 134 separate fires merged to become a giant firestorm. 12 m
    tsunami hit shores of Sagami Bay. 143,000 killed.

Aleutian Islands, Alaska; April 1, 1946; 7.4 M_s
    Large tsunami destroyed a power station and caused $25 million in damage in Hilo, Hawaii, where it rose to 7 m
    in height.

Lituya Bay, Alaska; July 10, 1958; 7.0 M_s
    Massive landslides that slid into a local bay created a 60 m-high wave that washed up mountain sides as far as
    540 m.

Hebgen Lake, MT; August 17, 1959; 7.5 M_s
    Extensive landslides, including one that dammed a river and created a lake. Reactivated 160 Yellowstone
    geysers. Vertical displacement up to 6.5 m. 28 killed.

Chile; May 21, 1960; 9.5 M_w
    Largest quake ever recorded. Fault area: 800 by 200 km. Slip: 21 m. Triggered eruption of Puyehue volcano.
    Massive landslides in Andes. Giant tsunami. 2000-3000 killed.

Alaska; March 27, 1964; 9.1 M_w
    2nd largest quake ever recorded. Fault area: 500 by 300 km. Slip: 7 m. Large tsunamis, and widespread
    liquefaction. 200,000 km^2 of crustal surface deformed. 131 killed.

Peru; May 31, 1970; 7.8 M_s
    Quake offshore caused large landslides. 30,000 killed, largely by 100,000,000 m^3 of rock and ice flowing down
    Andes mountain sides.

San Fernando Valley, CA; February 9, 1971; 6.6 M_s
    Felt over more than 200,000 mi^2. 65 killed. 1000 injured. More than $500 million in direct losses.

Haicheng, China; February 4, 1975; 7.4 M_s
    Successful prediction said to have led to an evacuation on the morning of the quake that possibly saved
    100,000s of lives. 300-1200 killed.

Kalapana, Hawaii; November 29, 1975; 7.1 M_s
    South flank of Kilauea volcano slid seaward. 14.6 m-high tsunami on Hawaiian shores. Largest Hawaiian
    earthquake since an 1868 quake that caused 22 m-high tsunamis and killed 148.

Tangshan, China; July 27, 1976; 7.6 M_s
    Of a city of 1 million, >250,000 killed and 50,000 injured. Exact numbers speculative: fatalities may have
    exceeded the 1556 earthquake. In contrast to the 1975 Haicheng quake, this had no precursory behaviors.

Mexico City, Mexico; September 19, 1985; 7.9 M_s
    Strong shaking lasted for 3 minutes due to sedimentary lake-fill oscillations. 10,000 killed. 30,000 injured.
    $3 billion in damage.

Spitak, Armenia; December 7, 1988; 6.8 M_s
    Surface faulting showed 1.5 m of slip along a 10 km fault. 25,000 killed. 19,000 injured. 500,000 homeless.
    $6.2 billion in damages.

Loma Prieta, CA; October 17, 1989; 7.1 M_s
    Slip along San Andreas segment south of San Francisco. 63 killed, most from the collapse of an elevated freeway
    in Oakland. About $6 billion in damages. Disrupted 3rd game of World Series.

Caspian Sea, Iran; June 20, 1990; 7.7 M_s
    100,000 structures damaged or destroyed. 40,000 killed. 60,000 injured. 500,000 left homeless. Over
    700 villages destroyed, and another 300 damaged.

Luzon, Philippines; July 16, 1990; 7.8 M_s
    Major rupture of Digdig fault, causing many landslides and major surface faulting. Extensive soil liquefaction.
    1621 killed. 3000 injured.

Landers, CA; June 28, 1992; 7.3 M_w
    Up to 6 m of horizontal displacement and 2 m of vertical displacement along a 70 km fault segment.
    1 killed. 400 injured.

Flores Island, Indonesia; December 12, 1992; 7.8 M_s
    Tsunami heights reached 25 m. Extensive shoreline damage, where tsunami run-up was up to 300 m.
    2200 killed. 30,000 buildings destroyed.

Northridge, CA; January 17, 1994; 6.7 M_w
    Rupture on a blind thrust fault beneath Los Angeles. Many rock slides, ground cracks, and soil liquefaction.
    58 killed. 7000 injured. 20,000 homeless. About $20 billion in damages.

Northern Bolivia; June 9, 1994; 8.2 M_s
    Largest deep earthquake ever (depth was 637 km). Felt as far away as Canada.

Kobe, Japan; January 16, 1995; 6.8 M_s
    5502 killed. 36,896 injured. 310,000 homeless. Massive destruction to world's 3rd largest seaport: 193,000
    buildings, $100 billion in damages (highest to date).

NW of Balleny Islands; March 25, 1998; 8.2 M_w
    Largest oceanic intraplate earthquake ever. Occurred west of Australia-Pacific-Antarctic plate triple junction in
    a region that was previously aseismic.

Izmit, Turkey; August 17, 1999; 7.4 M_s
    5 m slip. 120 km rupture. 30,000 killed. $20 billion in economic loss. 12 major (M > 6.7) events this century have
    broken a total of 1000 km of the North Anatolian fault, including a 7.2 M_w aftershock on Nov. 12, 1999.

Chi-Chi, Taiwan; September 21, 1999; 7.6 M_w
    150 km south of Taipei. 2333 killed. 10,000 injured. >100,000 homeless. Extensive seismic monitoring in Taiwan
    makes this one of the best seismically sampled earthquakes. One of largest observed surface thrust scarps.

The earthquake risk in the United States is much less than in 
many other countries because large earthquakes are relatively 
rare in most of the country and because of earthquake-resistant 
construction. 4 The most seismically active area is southern 
Alaska, a subduction zone subject to large earthquakes. How¬ 
ever, the population there is relatively small, so the 1964 earth¬ 
quake (the second largest ever recorded instrumentally) caused 
far fewer deaths than a comparable earthquake would have in 
Japan. The primary earthquake impact in recent years has been 
in California. The 1994 Northridge earthquake killed 58 peo¬ 
ple and caused about $20 billion worth of damage in the Los 
Angeles area, and the 1989 Loma Prieta earthquake that shook 
the San Francisco area during a 1989 World Series baseball 
game killed 63 people and did about $6 billion worth of 
damage. Both these earthquakes were smaller (magnitude 6.8 
and 7.1, respectively) than the largest known to occur on the 
San Andreas fault, such as the 1906 San Francisco earthquake, 
which had a magnitude of about 7.8. 

Compared to other risks, earthquakes are not a major 
cause of death or damage in the USA. Most earthquakes do 
little harm, and even those felt in populated areas are com¬ 
monly more of a nuisance than a catastrophe. Since 1811, 
US earthquakes have claimed an average of nine lives per year 
(Table 1.2-3), putting earthquakes at the level of in-line skating 


Table 1.2-3 Some causes of death in the United States, 1996. 


Cause of death                          Number of deaths 

Heart attack                            733,834 
Cancer                                  544,278 
Stroke                                  160,431 
Lung disease                            106,143 
Pneumonia/influenza                      82,579 
Diabetes                                 61,559 
Motor vehicle accidents                  43,300 
AIDS                                     32,655 
Suicide                                  30,862 
Liver disease/cirrhosis                  25,135 
Kidney disease                           24,391 
Alzheimer's                              21,166 
Homicide                                 20,738 
Falling                                  14,100 
Poison                                   10,400 
Drowning                                  3,900 
Fires                                     3,200 
Suffocation                               3,000 
Bicycle accidents                           695 
Severe weather¹                             514 
In-line skating²                             25 
Football²                                    18 
Skateboards²                                 10 
Earthquakes (1811-1983),³ per year            9 
Earthquakes (1984-98), per year               9 


1 From the National Weather Service (property loss due to severe weather 
is $10-15 billion/yr, comparable to the Northridge earthquake, and that 
from individual hurricanes can go up to $25 billion). 

2 From the Consumer Product Safety Commission. 

3 From Gere and Shah (1984). 

All others from the National Safety Council and National Center for 
Health Statistics. 


4 Many seismologists have faced situations like explaining to apprehensive 
telephone callers that the danger of earthquakes is small enough that the callers’ 
upcoming family vacations to Disneyland are not suicidal ventures. 
or football, 5 but far less than bicycles, for risk of loss of 
life. Similarly, the $20 billion worth of damage from the 
Northridge earthquake, though enormous, is about 10% of the 
annual loss due to automobile accidents. As a result, earth¬ 
quakes pose an interesting challenge to society because they 
cause infrequent, but occasionally major, fatalities and dam¬ 
age. Society seems better able to accept risks that are more 
frequent but where individual events are less destructive. 6 

Similar issues surface when society must decide the costs, 
benefits, and appropriateness of various measures to reduce 
earthquake risks. Conceptually, the issues are essentially those 
faced in daily life. For example, a home security system costing 
$200 per year makes sense if one anticipates losing $1000 in 
property to a burglary about every five years ($200/year), but 
not if this loss is likely only once every 25 years ($40/year). 
However, the analysis is difficult, because the limited historical 
record of earthquakes makes it hard to assess their recurrence 
and potential damage. 
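
The security-system comparison above can be written as a short calculation. The sketch below is only an illustration: the function names are invented for this example, and it assumes that losses recur at a steady average interval, which is exactly what the limited historical record makes hard to establish for earthquakes.

```python
def expected_annual_loss(loss_per_event, recurrence_years):
    """Average yearly loss if an event costing loss_per_event recurs every recurrence_years."""
    return loss_per_event / recurrence_years


def mitigation_is_worthwhile(annual_cost, loss_per_event, recurrence_years):
    """True if the yearly cost of protection does not exceed the expected yearly loss."""
    return annual_cost <= expected_annual_loss(loss_per_event, recurrence_years)


# Numbers from the home security example in the text.
for years in (5, 25):
    loss = expected_annual_loss(1000, years)
    verdict = mitigation_is_worthwhile(200, 1000, years)
    print(f"burglary every {years} years: expected loss ${loss:.0f}/year, worthwhile: {verdict}")
```

The same comparison applies to earthquake-resistant construction, except that both the recurrence interval and the loss per event are then uncertain estimates rather than known inputs.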

Seismology is used in various ways to try to mitigate earth¬ 
quake risks. Studies of past earthquakes are integrated with 
other geophysical data to forecast the location and size of 
future earthquakes. These estimates help engineers design 
earthquake-resistant structures, and help engineers and public 
authorities estimate and prepare for future damage by develop¬ 
ing codes for earthquake-resistant construction. Seismology is 
also used by the insurance industry to develop rates for earth¬ 
quake insurance, which can reduce the financial losses due to 
earthquakes and provide the resources for economic recovery 
after a damaging earthquake. Rates can be based on factors in¬ 
cluding the nature of a structure, its location relative to active 
faults, and soil conditions. Homeowners and businesses then 
decide whether to purchase insurance, depending on their per¬ 
ceived risk and the fact that damages must exceed a deductible 
amount (10-15% of the insured value) before the insurance 
company pays. A complexity for the insurer is that, unlike 
automobile accidents, whose occurrence is relatively uniform, 
earthquakes or other natural disasters are rare but can produce 
concentrated damage so large as to imperil the insurer’s ability 
to pay claims. Approaches to this problem include limits on 
how much a company will insure in a given area, the use of 
reinsurance by which one insurance company insures another, 
catastrophe bonds that spread the financial risk into the global 
capital market, and government insurance programs. 
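
The effect of a deductible on what a policyholder recovers can be sketched as follows. This is a simplified illustration, not a description of any actual policy: the function name is hypothetical, and it assumes that the insurer pays the damage above the deductible, with the covered damage capped at the insured value.

```python
def insurance_payout(damage, insured_value, deductible_fraction):
    """Payout under the assumed rule: covered damage minus the deductible, never negative."""
    if not 0.0 <= deductible_fraction <= 1.0:
        raise ValueError("deductible_fraction must lie between 0 and 1")
    deductible = deductible_fraction * insured_value
    covered_damage = min(damage, insured_value)  # assumed cap at the insured value
    return max(0.0, covered_damage - deductible)


# Hypothetical house insured for $300,000 with a 15% deductible ($45,000).
for damage in (30_000, 100_000, 300_000):
    payout = insurance_payout(damage, 300_000, 0.15)
    print(f"damage ${damage:,}: payout ${payout:,.0f}")
```

With a deductible of this size, moderate damage is borne entirely by the owner, which is one reason the decision to buy coverage depends on perceived risk.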

1.2.2 Engineering seismology and earthquake engineering 

Most earthquake-related deaths result from the collapse of 
buildings, because people standing in an open field during a 
large earthquake would just be knocked down. Thus it is often 
stated that in general “earthquakes don’t kill people; buildings 

5 These figures are for American football; in other countries soccer, termed football 
there, is safer for players but more dangerous for spectators. 

6 For example, although considerable attention is paid to aviation disasters and 
safety, far more lives could be saved at far less cost by enforcing automobile seat belt 
laws. 


kill people.” As a result, proper construction is the primary 
method used to reduce earthquake risks. This issue is addressed 
by engineering seismology and earthquake engineering, dis¬ 
ciplines at the interface between seismology and civil engineer¬ 
ing. Their joint goal is to understand the earthquake ground 
motions that can damage buildings and other critical struc¬ 
tures, and to design structures to survive them or at least ensure 
the safety of the inhabitants. 

These studies focus on the strong ground motion near earth¬ 
quakes that is large enough to do damage, rather than the much 
smaller and often imperceptible ground motions used in many 
other seismological applications. Two common measures are 
used to characterize the ground motion at a site. One is the 
acceleration, or the second time derivative of the ground displacement. 
Accelerations are primarily responsible for building destruc¬ 
tion. A house would be unharmed on a high-speed train going 
along a straight track, where there is no acceleration. However, 
during an earthquake the house will be shaken and could be 
damaged if the accelerations were large enough. These issues 
are investigated using seismometers called accelerometers that 
can operate during violent shaking close to an earthquake but are 
less sensitive to the smaller ground motion from distant earth¬ 
quakes. The seismic hazard to a given area is often described 
by numerical models that estimate how likely an area is to ex¬ 
perience a certain acceleration in a given time. For example, the 
hazard map in Fig. 1.2-3 predicts the maximum acceleration 
expected at a 2% probability in the next 50 years, or at least 
once during the next 2500 (50/0.02) years. These values are 
given as a fraction of “g,” the acceleration of gravity (9.8 m/s²). 
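
The arithmetic behind such hazard statements can be sketched in a few lines. The simple estimate is the 50/0.02 division quoted above. The second function assumes, as is common in hazard analysis but not stated in the text, that exceedances occur randomly in time as a Poisson process, which gives a slightly shorter return period. The function names are chosen only for this illustration.

```python
import math

G = 9.8  # acceleration of gravity in m/s^2, as quoted in the text


def return_period_simple(probability, window_years):
    """Window length divided by probability, as in the text's 50/0.02."""
    return window_years / probability


def return_period_poisson(probability, window_years):
    """Return period T satisfying probability = 1 - exp(-window_years / T)."""
    return -window_years / math.log(1.0 - probability)


def acceleration_in_si(fraction_of_g):
    """Convert an acceleration given as a fraction of g to m/s^2."""
    return fraction_of_g * G


print(f"simple estimate:  {return_period_simple(0.02, 50):.0f} years")
print(f"Poisson estimate: {return_period_poisson(0.02, 50):.0f} years")
print(f"0.4 g = {acceleration_in_si(0.4):.2f} m/s^2")
```

For small probabilities the two estimates nearly coincide, which is why the simple division is an adequate summary.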

A second way to characterize strong ground motion uses 
intensity, a descriptive measure of the effects of shaking. 
Table 1.2-4 shows values for the commonly used Modified 
Mercalli intensity (MMI) scale, which uses roman numerals 
ranging from I (generally unfelt) to XII (total destruction). 
Intensity is not uniquely related to acceleration, which is a 
numerical parameter that seismologists compute for an earth¬ 
quake and engineers use to describe building effects. The table 
shows an approximate correspondence between intensity and 
acceleration, but this can vary. However, intensity has the 
advantage that it is inferred from human accounts, and so can 
be determined where no seismometer was present and for 
earthquakes that occurred before the modern seismometer was 
invented (about 1890). Although intensity values can be 
imprecise (a fallen chimney can raise the value for a large area), 
they are often the best information available about historic 
earthquakes. For example, intensity data provide much of 
what is known about the New Madrid earthquakes of 1811 
and 1812 (Fig. 1.2-4). These large earthquakes are interesting 
in that they occurred in the relatively stable continental interior 
of the North American plate (Section 5.6). Historical accounts 
show that houses fell down (intensity X) in the tiny Mississippi 
river town of New Madrid, and several chimneys toppled 
(intensity VII) near St Louis. Intensities can be used to infer 
earthquake magnitudes, albeit with significant uncertainties. 
These data have been used to infer the magnitude (about 7.2 ± 
Fig. 1.2-3 A map of estimated earthquake hazards in the United States. The predicted hazards are plotted as the maximum acceleration of ground shaking 
expected at a 2% probability over a 50-year period. Although the only active plate boundaries are in the western USA, other areas are also shown as having 
significant hazards. (Courtesy of the US Geological Survey.) 


0.3 in the study shown) and fault geometry of the historic 
earthquakes and to give insight into the effects of future ones. 

The variation in ground motion with distance from an 
earthquake can be seen by plotting lines of constant intensity, 
known as isoseismals. Typically, as illustrated in Fig. 1.2-4, the 
intensity decays with distance from the earthquake. Similarly, 
strong motion data show that the variation in acceleration a 
with earthquake magnitude M and distance r from the earthquake 
can be described approximately by relations like 

$a(M, r) = b \, 10^{cM} \, r^{-d}$,    (1) 

where b, c, and d are constants that depend on factors including 
the geology of the area in question, the earthquake depth 
and fault geometry, and the frequency of ground motion. 
Hence the predicted ground acceleration increases with earthquake 
magnitude and falls off rapidly with distance at a rate 
depending on the rock type. For example, rocks in the USA east 
of the Rocky Mountains transmit seismic energy better than 
those in the western USA (Section 3.7.10), so earthquakes in 
the East are felt over a larger area than earthquakes of the same 
size in the West (Fig. 1.2-5). Because the shaking decays rapidly 
with distance, nearby earthquakes can do more damage than 
larger ones further away. 

The damage resulting from a given ground motion depends 
on the types of buildings. As shown in Fig. 1.2-6, reinforced 
concrete fares better during an earthquake than a timber frame, 
which does better than brick or masonry. Hence, as also shown 
in Table 1.2-4, serious damage occurs for about 10% of brick 
buildings starting above about intensity VII (about 0.2 g), 
whereas reinforced concrete buildings have similar damage 
only around intensity VIII-IX (about 0.3-0.5 g). Buildings 
designed with seismic safety features do even better. The worst 
earthquake fatalities, such as the approximately 25,000 deaths 
in the 1988 Spitak (Armenia) earthquake, occur where many of 
the buildings are vulnerable (Fig. 1.2-7). Hence a knowledgeable 
observer 7 estimated that an earthquake of this size would 
cause approximately 30 deaths in California. This estimate 
proved accurate for the 1989 Loma Prieta earthquake, which 
was slightly larger and killed 63 people. 

7 Ambraseys (1989). 

Designing buildings to withstand earthquakes is a technical, 
economic, and societal challenge. Research is being directed to 
better understand how buildings respond to ground motion 
and how they should be built to best survive it. Because such 
design raises construction costs and thus diverts resources from 
other uses, some of which might save more lives at less cost 
or otherwise do more societal good, the issue is to assess the 
seismic hazard and choose a level of earthquake-resistant 
Table 1.2-4 Modified Mercalli intensity scale. 


Intensity Effects 


I Shaking not felt, no damage: not felt except by a very few under especially favorable circumstances. 

II Shaking weak, no damage: felt only by a few persons at rest, especially on upper floors of buildings. Delicately suspended objects 
may swing. 

III Felt quite noticeably indoors, especially on upper floors of buildings, but many people do not recognize it as an earthquake. 
Standing automobiles may rock slightly. Vibration like passing of truck. Duration estimated. 

IV Shaking light, no damage: during the day felt indoors by many, outdoors by very few. At night some awakened. Dishes, windows, 
doors disturbed; walls make creaking sound. Sensation like heavy truck striking building. Standing automobiles rocked noticeably 
(0.015-0.02 g) 

V Shaking moderate, very light damage: felt by nearly everyone, many awakened. Some dishes, windows, and so on broken; cracked 
plaster in a few places; unstable objects overturned. Disturbances of trees and poles, and other tall objects sometimes noticed. 
Pendulum clocks may stop. (0.03-0.04 g) 

VI Shaking strong, light damage: felt by all, many frightened and run outdoors. Some heavy furniture moved; a few instances of 
fallen plaster and damaged chimneys. Damage slight. (0.06-0.07 g) 

VII Shaking very strong, moderate damage: everybody runs outdoors. Damage negligible in buildings of good design and 
construction; slight to moderate in well-built ordinary structures; considerable in poorly built or badly designed structures; some 
chimneys broken. Noticed by persons driving cars. (0.10-0.15 g) 

VIII Shaking severe, moderate to heavy damage: damage slight in specially designed structures; considerable in ordinary substantial 
buildings with partial collapse; great in poorly built structures. Panel walls thrown out of frame structures. Fall of chimneys, factory 
stacks, columns, monuments, walls. Heavy furniture overturned. Sand and mud ejected in small amounts. Changes in well water. 
Persons driving cars disturbed. (0.25-0.30 g) 

IX Shaking violent, heavy damage: damage considerable in specially designed structures; well-designed frame structures thrown out 
of plumb; great in substantial buildings, with partial collapse. Buildings shifted off foundations. Ground cracked conspicuously. 
Underground pipes broken. (0.50-0.55 g) 

X Shaking extreme, very heavy damage: some well-built wooden structures destroyed; most masonry and frame structures destroyed 
with foundations; ground badly cracked. Rails bent. Landslides considerable from river banks and steep slopes. Shifted sand and 
mud. Water splashed, slopped over banks. (More than 0.60 g) 

XI Few, if any, (masonry) structures remain standing. Bridges destroyed. Broad fissures in ground. Underground pipelines completely 
out of service. Earth slumps and land slips in soft ground. Rails bent greatly. 

XII Damage total. Waves seen on ground surfaces. Lines of sight and level destroyed. Objects thrown into the air. 


Note: Parentheses show the average peak acceleration in terms of g (9.8 m/s²), taken from Bolt (1999). 
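
The parenthetical accelerations in Table 1.2-4 can be turned into a rough lookup from peak acceleration to intensity. The sketch below is an assumption-laden illustration: it treats the lower bound quoted for each intensity as a threshold, and the function name is invented here. Intensity is not uniquely related to acceleration, so the result is only a first guess.

```python
from bisect import bisect_right

# Lower bounds of the average peak accelerations (in g) quoted in Table 1.2-4
# for intensities IV, V, VI, VII, VIII, IX, and X.
THRESHOLDS_G = [0.015, 0.03, 0.06, 0.10, 0.25, 0.50, 0.60]
LABELS = ["III or lower", "IV", "V", "VI", "VII", "VIII", "IX", "X or higher"]


def approximate_intensity(acceleration_g):
    """Return the highest tabulated intensity whose threshold the acceleration reaches."""
    if acceleration_g < 0:
        raise ValueError("acceleration must be non-negative")
    return LABELS[bisect_right(THRESHOLDS_G, acceleration_g)]


for a in (0.01, 0.05, 0.2, 0.3, 0.7):
    print(f"{a:.2f} g -> intensity {approximate_intensity(a)}")
```

Accelerations that fall in the gaps between the tabulated ranges are assigned to the lower intensity, which is a choice made for this sketch rather than part of the scale.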
Fig. 1.2-4 Isoseismals for the first of the 
three largest earthquakes of the 1811-12 
New Madrid earthquake sequence. Such 
plots, though based on sparse data, often 
provide the best assessment of historical 
earthquakes and of the effects of future 
ones. (After Hough et al., 2000. J. Geophys. 
Res., 105, 23,839-64. Copyright by the 
American Geophysical Union.) 
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I ; i". 1.2-5 Comparison of the predicted strong ground motion as a 
Amction of distance from magnitude 7 and 6 earthquakes in the eastern 
and western United States. Shaking from an earthquake in the east is 
comparable to that from one a magnitude unit larger in the west. The 
curves are computed from models by Atkinson and Boore (1995) and 
Sadigh etal. (1997). 
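
The attenuation relation of equation (1), the kind of expression that underlies curves like those in Fig. 1.2-5, can be explored numerically. In the sketch below the constants b, c, and d are placeholders chosen only for illustration. They are not fitted values from Atkinson and Boore (1995), Sadigh et al. (1997), or any other study, and the two values of d merely stand in for regions where shaking decays quickly or slowly with distance.

```python
def peak_acceleration(magnitude, distance_km, b, c, d):
    """Evaluate a(M, r) = b * 10**(c*M) * r**(-d) from equation (1)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return b * 10.0 ** (c * magnitude) * distance_km ** (-d)


# Hypothetical constants, for illustration only (acceleration in g, distance in km).
B, C = 0.02, 0.3
D_FAST_DECAY = 1.0  # stands in for a region where shaking dies off quickly
D_SLOW_DECAY = 0.8  # stands in for a region that transmits seismic energy better

for magnitude, distance_km in [(6, 10), (7, 10), (7, 50), (7, 100)]:
    fast = peak_acceleration(magnitude, distance_km, B, C, D_FAST_DECAY)
    slow = peak_acceleration(magnitude, distance_km, B, C, D_SLOW_DECAY)
    print(f"M{magnitude} at {distance_km:3d} km: {fast:.3f} g (d=1.0), {slow:.3f} g (d=0.8)")
```

Even with these made-up constants, the qualitative behavior matches the text: acceleration grows with magnitude, falls with distance, and a magnitude 6 earthquake 10 km away shakes a site harder than a magnitude 7 earthquake 50 km away.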





Fig. 1.2-7 Five-story building in Spitak, Armenia, destroyed during the 
December 7, 1988, earthquake. The building was made from precast 
concrete frames that were inadequately connected. The failure of such 
buildings contributed greatly to the loss of 25,000 lives. (Courtesy of the 
US Geological Survey.) 



Fig. 1.2-6 Approximate percentage of buildings that collapse as a 
function of the intensity of earthquake-related shaking. The survival of 
buildings differs greatly for constructions of weak masonry, fired brick, 
timber, and reinforced concrete (with and without anti-seismic design). 
(After Coburn and Spence, Earthquake Protection, © 1992. Reproduced 
by permission of John Wiley & Sons Limited.) 


construction that makes economic sense. Countries like the 
USA and Japan have the financial resources to study the effects 
of shaking on buildings, develop codes of appropriate building 
construction, and build structures to meet those codes. The 
task for building codes is to be neither too weak, permitting unsafe 
construction and undue risks, nor too strong, imposing unneeded 
costs and encouraging their evasion. Deciding where 
to draw this line is a complex policy issue for which there 
is no unique answer. Making the appropriate decisions is 
even more difficult in developing nations, many of which 
face serious hazards but have even larger alternative demands 
for resources that could be used for seismic safety. A classic 


example is the choice between building schools for towns 
without them or making existing schools earthquake-resistant. 

A related issue is ensuring that buildings are built to the 
codes, given the tendency to evade expensive regulations de¬ 
signed to deal with events that are infrequent on a human time 
scale. For example, much damage occurred during large earth¬ 
quakes in Turkey in 1999 because the building codes were not 
enforced. It has been reported that walls crumbled, revealing 
empty olive oil cans inserted during construction to save the 
costs of concrete. 

Much of what has been learned about safe construction has 
been via trial and error. In California, the first major set of 
building codes was enacted following the 1933 Long Beach 
earthquake, which did $41 million worth of damage and killed 
120 people. With successive destructive earthquakes, engineers 
have acquired a better sense of what works best, and build¬ 
ing codes have been modified. For instance, buildings have 
become more resistant to the lateral shear that accompanies 
horizontal shaking with the use of shear walls consisting of 
concrete reinforced with steel. Similarly, measures have been 
developed to retrofit older buildings to increase their earth¬ 
quake resistance. 

An important factor for earthquake engineers is that struc¬ 
tures resonate at different periods. Although the resonant 
period or periods depend on the specific building geometry and 
materials, they generally increase with an increase in the height 
or base width of a building. For example, typical houses or 
small buildings have periods of about 0.2 s, whereas a typical 
10-story building has a period around 1 s. If the peak energy of 
ground motion is close to a building’s resonant period, and the 
shaking continues long enough, the building may undergo 
large oscillations and be seriously damaged. This effect is 
like a swing — pushing at random intervals will likely stop 
the swing, whereas pushing repeatedly at its resonant period 
gives the person on it a good ride. Through this mechanism, 
an earthquake can destroy certain buildings and not others. 
Similarly, a building might collapse after a magnitude 7 earth¬ 
quake, but remain standing after a magnitude 8 event with 
peak energy at a lower frequency. Sometimes damage occurs 
because adjacent buildings resonate out of phase, making their 
tops collide. 

Another crucial factor for earthquake-resistant construction 
is the ground material of the site. Loose sediments and other 
weak rocks at the surface enhance ground motion compared 
to bedrock sites. As shown in Section 2.4.5, near-surface 
sediments can increase ground displacements by more than 
an order of magnitude. For instance, during the 1989 Loma 
Prieta earthquake, areas that sustained the worst damage 
corresponded to ones of high risk identified on the basis of 
subsurface geology. The failures of buildings in the Marina 
district, the Bay Bridge, and the Nimitz freeway all occurred 
on sedimentary layers. 

An example of these effects occurred in 1985 in Mexico City, 
which is built on the sedimentary fill of an ancient lake that has 
dried up since the time of the Aztecs. A magnitude 7.9 earth¬ 
quake at the subduction zone to the west caused the sediment¬ 
ary basin to shake for more than 3 minutes (an unusually long 
time) at a dominant period of about 2 s. The worst damage was 
sustained by buildings with 6-15 stories, which had resonant 
periods of 1-3 s. Shorter or taller buildings were less damaged 
because they did not resonate with the ground shaking. This 
damage pattern has repeated for successive earthquakes. 
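
The resonance effect can be illustrated with a standard idealization that the text does not spell out: treat a building as a damped single-degree-of-freedom oscillator driven by sinusoidal ground motion. Its steady-state response, relative to the response to very slow shaking, is given by the dynamic magnification factor computed below. The 5% damping ratio and the building periods are assumptions made for this sketch, and real buildings and real ground motion are considerably more complicated.

```python
import math


def magnification(building_period, ground_period, damping=0.05):
    """Dynamic magnification factor of a damped oscillator under sinusoidal forcing."""
    # Ratio of forcing frequency to natural frequency, expressed with periods.
    r = building_period / ground_period
    return 1.0 / math.sqrt((1.0 - r**2) ** 2 + (2.0 * damping * r) ** 2)


GROUND_PERIOD = 2.0  # dominant period of shaking in Mexico City in 1985, in seconds

# Assumed resonant periods: a house, a 10-story building, a building tuned to
# the ground motion, and a much taller structure.
for period in (0.2, 1.0, 2.0, 5.0):
    factor = magnification(period, GROUND_PERIOD)
    print(f"building period {period:.1f} s: magnification {factor:.2f}")
```

In this idealization the response peaks sharply when the building period matches the ground period and drops off for both shorter and taller structures, consistent with the damage pattern described above.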

1.2.3 Highways, bridges, dams, and pipelines 

Buildings are not the only challenge for earthquake-resistant 
construction. Highways, bridges, parking structures, land¬ 
fills, dams, pipelines, and power plants present additional 
problems. Many of these structures are crucial to society, so 
considerable effort is made to ensure that they will survive 
earthquakes. 

Elevated highways often fail during earthquakes. Most of 
the lives lost during the 1989 Loma Prieta earthquake were due 
to the collapse of the Nimitz freeway in Oakland. In Los Ange¬ 
les, the I-5 freeway was built to withstand a large earthquake, 
but parts were destroyed during the 1971 San Fernando earth¬ 
quake. These were rebuilt, but parts collapsed again during the 
1994 Northridge shock. A dramatic highway failure occurred 
during the 1995 Kobe earthquake, when a 20 km length of 
an expressway supported by large concrete piers fell over, 
crushing many cars and trucks. 

Similar problems beset bridges, as illustrated in the 1989 
Loma Prieta earthquake. The Bay Bridge connecting San Fran¬ 
cisco and Oakland is a double-deck bridge built in 1936 with 
little flexibility and rests on sedimentary rocks. A large piece of 
the upper span collapsed during the earthquake (Fig. 1.2-8), 
and the bridge was closed for months for repairs. By contrast, 
the Golden Gate Bridge, a suspension bridge built into bed- 



Fig. 1.2-8 Damage to the Bay Bridge, connecting San Francisco 
and Oakland, from the October 17, 1989, Loma Prieta earthquake. 

The bridge is of old construction (1936), and its supports rest in 
sedimentary fill that amplifies ground shaking. (Courtesy of the 
US Geological Survey.) 

rock, was designed to withstand a large amount of shaking and 
fared well. 

The failure of dams due to earthquakes poses considerable 
risk, as illustrated by the near-failure of the lower Van Norman 
dam during the 1971 San Fernando earthquake. A segment of 
the dam 600 m long broke and slid into the reservoir (Fig. 1.2-9), 
lowering the dam by 10 m and leaving it only 1.5 m above 
the water. Fortunately, the area had been suffering from a 
drought, and the reservoir was only half full. Eighty thousand 
people living below the dam were evacuated, and the reser¬ 
voir was quickly drained. The dam was replaced by a more 
modern dam that suffered only minor cracking during the 
1994 Northridge earthquake. 

Dams have the special problem that they can cause earth¬ 
quakes. This seems counter-intuitive, because the added weight 
of the water should increase the pressure on the rock below and 
inhibit faulting, because the two sides of the fault are pressed 
together harder, requiring a greater force to overcome the 
friction. However, it seems that the water impounded by dams 
sometimes flows into the rock, lowering the friction across 
faults and making rupture easier. The effect can be noticeable; 
seismicity associated with the man-made lake in Koyna, India, 
seems to follow a seasonal curve, being more active follow¬ 
ing the rainy season when reservoir levels are higher. One 
earthquake in 1967 was large enough to kill 200 people. The 
possibility of reservoir-induced earthquakes is thus considered 
when designing dams. 
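
The competition between added weight and infiltrating water can be made concrete with a simple friction model. The sketch below uses the commonly adopted effective-stress description, which is an assumption here rather than something stated in the text: the shear stress needed for slip is a friction coefficient times the normal stress minus the pore fluid pressure. The friction coefficient and all stress values are hypothetical.

```python
def frictional_strength(normal_stress, pore_pressure, friction_coefficient=0.6):
    """Shear stress needed for slip: mu * (normal stress - pore pressure), never negative."""
    effective_normal_stress = max(normal_stress - pore_pressure, 0.0)
    return friction_coefficient * effective_normal_stress


# Hypothetical stresses on a fault, in MPa.
before_reservoir = frictional_strength(normal_stress=100.0, pore_pressure=40.0)
weight_only = frictional_strength(normal_stress=101.0, pore_pressure=40.0)
with_infiltration = frictional_strength(normal_stress=101.0, pore_pressure=43.0)

print(f"before reservoir:       {before_reservoir:.1f} MPa")
print(f"added weight only:      {weight_only:.1f} MPa")
print(f"weight plus pore water: {with_infiltration:.1f} MPa")
```

In this picture the added weight alone strengthens the fault slightly, but a pore pressure rise larger than the added load leaves the fault weaker than it was before the reservoir was filled.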

The greatest cause of earthquake-related death and destruc¬ 
tion, other than the collapse of buildings, is fire. An important 
contributor to this problem is that water pipelines can rupture, 
making fire fighting harder. In the 1906 San Francisco earth¬ 
quake, many buildings were damaged by the shaking, but fires 
that lasted three days are thought to have done ten times more 
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; 1.2-9 Failure of the lower Van Norman dam that occurred during the 
ebruary 9,1971, San Fernando valley earthquake. Flooding did not 
ccur because the region had been experiencing a drought, and the water 
■1 was low. (Courtesy of the US Geological Survey.) 

damage (Fig. 1.2-10). Following the 1923 Tokyo earthquake, 
fires caused by overturned cooking stoves spread rapidly 
through the city and were unstoppable, due to ruptured water 
pipes. Many of the over 140,000 deaths resulted from fire, 
including a fire storm that engulfed 40,000 people who fled to 
an open area to escape collapsing buildings. In modern cities, 
natural gas pipelines can rupture, allowing flammable gas to 
escape and ignite. After the 1994 Northridge and 1995 Kobe 
earthquakes, both of which happened at night, the wide 
outbreaks of fires were the first way that rescue efforts could 
identify the areas that sustained the greatest damage. People in 
earthquake-prone areas are taught to turn off the gas supply to 
their homes if they smell gas after a large earthquake. 


Fig. 1.2-11 Aerial view of Valdez, Alaska, showing the inundation of the 
coastline following the great 1964 earthquake. The resulting tsunami was 
as high as 32 m in places. (National Geophysical Data Center. Courtesy of 
the US Department of the Interior.) 


1.2.4 Tsunamis, landslides, and soil liquefaction 

Spectacular exceptions to the truism that “earthquakes don’t 
kill people, buildings kill people” include tsunamis, landslides, 
avalanches, and soil liquefaction. Earthquake hazard planning 
thus includes identifying sites where these risks are present. 

Tsunamis are large water waves that occur when portions of 
the sea floor are displaced by volcanic eruptions, submarine 
landslides, or underwater earthquakes (Fig. 1.2-11). Tsunamis 
are not noticeable as they cross the ocean, but can be amplified 
dramatically upon reaching the shore. The 1896 Sanriku 
(Japan) earthquake caused 35 m-high tsunamis that washed 
away 10,000 homes and killed 26,000 people. Hawaii is espe¬ 
cially susceptible to tsunamis from earthquakes around the 
Pacific rim. Tsunamis from the 1960 Chilean earthquake killed 
61 people in Hawaii, and the 1946 Alaska earthquake created 
a 7 m-high tsunami that washed over and short-circuited a 
power station, plunging Hilo into darkness. To address these 
risks, tsunami warning systems have been developed that assess 


Fig. 1.2-10 Fires burning in San Francisco 
five hours after the April 18, 1906, 
earthquake. Many buildings received little 
damage from the earthquake, but were 
destroyed by the fires that burned out of 
control for three days. (Courtesy of the 
National Geophysical Data Center.) 
Fig. 1.2-12 Landslide along California State Highway 17 in the Santa 
Cruz mountains, caused by shaking from the 1989 Loma Prieta 
earthquake. The landslide blocked the major commuter route between 
Santa Cruz and San Jose. (Courtesy of the US Geological Survey.) 


the likelihood that a large earthquake will generate a tsunami 
and issue warnings before the tsunami reaches distant areas. 

Ground shaking in areas with steep topography can cause 
destructive landslides and avalanches (Fig. 1.2-12). For ex¬ 
ample, a 1970 earthquake in Peru caused rock and ice land¬ 
slides that traveled downhill at speeds of 300 km/hr, burying 
villages and killing 30,000 people. 

Another earthquake hazard involves liquefaction, a process 
by which loose water-saturated sands behave like liquids when 
vigorously shaken. Under normal conditions, the sand grains 
are in contact with each other, and water fills the pore spaces 
between them. Strong shaking moves the grains apart, so the 
soil behaves like a fluid slurry similar to “quicksand.” Build¬ 
ings can sink, otherwise undamaged, during the few seconds of 
peak ground shaking, and end up permanently stuck when the 
shaking stops and the soil resolidifies. A classic example is the 
tilting and sinking of buildings in Niigata, Japan, during a 
1964 earthquake (Fig. 1.2-13). 

Ground consisting of loose wet sediment is most suscept¬ 
ible to liquefaction. Sometimes the sand is ejected out of the 
surface as sand blows. This happened in the Marina district of 
the San Francisco waterfront during the 1989 Loma Prieta 
earthquake. Ironically, some of the material that erupted from 
the ground was building rubble from the 1906 San Francisco 
earthquake that had been bulldozed into the bay to make new 
waterfront property. 

Liquefaction can be widespread and devastating, involving 
large downslope movements of soil called lateral spreading. In 
the 1920 Kansu, China, earthquake, downslope flows traveled 
over 1.5 km, killing 180,000 people. During the 1964 Alaska 
earthquake, parts of the Turnagain Heights section of Anchor¬ 
age liquefied and collapsed. A dramatic example occurred on 



Fig. 1.2-13 Damage to apartment buildings caused by soil liquefaction 
during the June 16, 1964, Niigata (Japan) earthquake. About a third of 
the city sank by as much as 2 m as a result of sand compaction. (Courtesy 
of the National Geophysical Data Center.) 


the island of Jamaica due to a magnitude 8 earthquake in 1692, 
where much of the town of Port Royal, built upon sand, sank 
about 4 m beneath the ocean. For years afterward, people on 
boats in the harbor could see houses below. 

1.2.5 Earthquake forecasting 

Reducing earthquake risks via resistant construction relies on 
identifying regions prone to earthquakes and estimating, even 
if crudely, how likely earthquakes are to occur and what shak¬ 
ing they might produce. Thus earthquake forecasting involves 
both scientific issues and the related question of how society 
can best use what seismology can provide. 

Before addressing the predictions of earthquakes, it is useful 
to consider predictions for other geophysical processes. For 
example, severe storms are predicted in several ways. The first 
are long-term average forecasts: Chicagoans expect winter 
snowstorms, whereas Miamians expect fall hurricanes. Public 
authorities, power companies, homeowners, and businesses use 
the historical record of storms to prepare for them. Although 
surprises occur, long-term forecasting is generally adequate to 
ensure that needed resources (snow plows, salt) are available, 
whereas funds are not wasted on unneeded preparations (snow 
plows in Miami). Second, short-term weather forecasting often 
can identify conditions under which a storm is likely to form 
soon. Third, once formed, storms are tracked in real time, 
so people are often warned a day or more in advance to make 
preparations. 

Similarly, volcanic hazard assessment begins with the loca¬ 
tion of volcanoes that are active or have been so recently (in 
geological terms). Based on the eruption history taken from 
historical accounts and the geologic record, long-term forecasts 
can be made. Short-term predictions are made using various 
phenomena that precede major eruptions: rising magma causes 
ground deformation, small earthquakes, and the release of 
volcanic gases. Finally, small eruptions usually precede a large 
one, making it possible to issue real-time warnings. Hence the 
record of volcanic predictions, though not perfect, 8 is reasonably 
good. The area around Mt St Helens was evacuated before 
the giant eruption of May 18, 1980, reducing the loss of life 
to only 60 people, including a geologist studying the volcano 
and citizens who refused to leave. The largest eruption of 
the second half of the twentieth century, Mt Pinatubo in the 
Philippines, destroyed over 100,000 houses and a nearby US 
Air Force base, yet only 281 people died because of evacuations 
during the preceding days. 

Seismologists would like to do as well for earthquakes. We 
would like to be able to forecast where they are on average 
likely to occur in years to come, predict them a few years to 
hours before they occur, and issue real-time warnings after an 
earthquake has occurred in situations where such a warning 
would be useful. However, the record of seismology in these 
areas is mixed. To date there has been some success in long-term 
forecasting, little if any in short-term prediction, and 
some in real-time warning. 

Earthquake forecasting, discussed in Section 4.7.3, estimates 
the probability that an earthquake of a certain magnitude will 
occur in a particular area during a specific time. For instance, 
a forecast might be a 25% probability of a magnitude 7 or 
greater earthquake occurring along the San Francisco segment 
of the San Andreas fault in the next 30 years. Forecasting uses 
the history of earthquakes on the fault and other geophysical 
information, such as the crustal motions measured using the 
Global Positioning System, to predict its likely future behavior. 
While forecasting is not relevant to short-term earthquake 
preparations, it is important in the enactment of building codes 
for earthquake-resistant construction, which are costly and 
require justification. Such forecasting is already successful in 
general ways; knowing that the San Andreas and nearby faults 
will be the sites of recurrent earthquakes has prompted building 
codes that are a major reason why the 1989 Loma Prieta 
and 1994 Northridge earthquakes caused few casualties. 

Going beyond general forecasts is more difficult. For ex¬ 
ample, the probabilistic hazard map for the USA in Fig. 1.2-3 
predicts a general pattern of higher hazards in areas of known 
past large earthquakes. Most of these, in California and 
Nevada, the Pacific Northwest, and Utah, are in the western 
USA, in the broad boundary zone between the Pacific and 
North American plates. In addition, high hazards are predic¬ 
ted in parts of the interior of the continent, near Charleston, 


8 In 1982, uplift of the volcanic dome and other activity near the resort town of 
Mammoth Lakes, California, suggested that an eruption might be imminent. Geologists 
issued a volcano alert, resulting in significant tensions with local business leaders. 
When no eruption occurred, geologists were the target of much local anger, and the 
county supervisor who arranged for an escape route in the event of a volcanic eruption 
was recalled in a special election. 


South Carolina, and the New Madrid seismic zone in the 
Midwest. The map attempts to quantify this risk in terms of the 
maximum expected acceleration (recall that 0.2 g corresponds 
approximately to the onset of significant building damage) 
during a time interval. Such maps are made by assuming where 
and how often earthquakes will occur, how large they will be, 
and then using ground motion models like those in Fig. 1.2-5 to 
predict how much ground motion they will produce. Because 
these factors are not well understood, especially in intraplate 
regions where large earthquakes are rare, hazard estimates 
have considerable uncertainties. 9 For example, the high hazard 
predicted for parts of the Midwest, exceeding that in San 
Francisco or Los Angeles, results from specific assumptions, 
and alternative assumptions yield quite different estimates 
(Fig. 1.2-14). 

Similarly, hazard estimates depend on the probability and 
hence recurrence time considered. Where the largest earthquakes 
are expected about every 200 years (for example, near 
a plate boundary as in California), a hazard map predicting 
the maximum acceleration expected at a 10% probabil¬ 
ity in the next 50 years, or at least once during the next 500 
(50/0.1) years, will be similar to one for 2% probability in the 
next 50 years, or at least once during the next 2500 (50/0.02) 
years, because each portion of plate boundary is expected to 
rupture at least once in 500 years. However, the two maps 
would differ significantly where large earthquakes are less 
frequent — for example, in an intraplate region like the New 
Madrid zone (Sections 4.7.1, 5.6.3). This issue is important in 
choosing building codes because typical buildings have a useful 
life of about 50 years. 
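
The conversion between a probability in a time window and a recurrence time can be sketched numerically. The short Python example below is an illustration, not part of the hazard-mapping procedure itself: the function names are hypothetical, and the second function assumes that exceedances occur as a Poisson process, an assumption the text does not make. It shows that the simple ratio used above (50/0.1 and 50/0.02) is close to the Poisson value.

```python
import math

def return_period_simple(probability, window_years):
    """Approximation used in the text: window length divided by probability."""
    return window_years / probability

def return_period_poisson(probability, window_years):
    """Return period assuming exceedances follow a Poisson process (an assumption)."""
    return -window_years / math.log(1.0 - probability)

for probability in (0.10, 0.02):
    simple = return_period_simple(probability, 50)
    poisson = return_period_poisson(probability, 50)
    print(f"{probability:.0%} in 50 yr: simple {simple:.0f} yr, Poisson {poisson:.0f} yr")
```

Under the Poisson assumption the two return periods come out slightly shorter than 500 and 2500 years, which does not change the argument: both greatly exceed a 200-year recurrence time.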

Because earthquakes are infrequent on a human time scale, 
it will be a long time before we know how well such estimates, 
which combine long-term earthquake forecasts and ground 
motion predictions, actually describe future earthquakes. 
Nonetheless, such estimates are used for purposes such as 
developing building codes and setting insurance rates. As a 
result, how to make meaningful predictions and hazard estim¬ 
ates, communicate their uncertainties to the public, and best 
use them for policy is a topic of discussion relevant not just 
to seismology but to the other earth sciences as well. 

A key scientific challenge for hazard estimation is that the 
process determining when large earthquakes recur is unclear. 
The underlying basis for seismic forecasting is the principle of 
elastic rebound (Section 4.1). In this model, large-scale crustal 
motions, in most cases due to plate motions, slowly build up 
stress and strain across locked faults. When the stress reaches 
a critical threshold, seismic slip occurs along the fault, and the 
stress immediately drops. The process then begins again. The 
repeat time for these earthquakes depends on the rate at which 
crustal motions load the fault and the properties of the rocks 
that control when it slips. 


9 Earthquake risk assessment has been described as “a game of chance of which we 
still don’t know all the rules” (Lomnitz, 1989). 
Fig. 1.2-14 Comparison of the predicted 
seismic hazard (peak ground acceleration 
expected at 2% probability in 50 years) 
from New Madrid seismic zone 
earthquakes for alternative parameter 
choices. Rows show the effect of varying 
the magnitude of the largest expected New 
Madrid fault earthquakes from 8 to 7, 
which primarily affects the predicted 
acceleration near the fault. Columns show 
the effect of two different ground motion 
models (“Frankel” and “Toro”) which 
affect the predicted acceleration over a 
larger area. (Newman et al., 2001. 
© Seismological Society of America.) 
This idea implies that the history of large past earthquakes 
in an area should indicate the probable time of the next one. 
Naturally, the longer the history available, the better. Unfortun¬ 
ately, the duration of earthquake cycles is typically long com¬ 
pared to the approximately 100-year history of instrumental 
seismology. In some parts of the world, like China and Japan, 
historical records extend well into the past, whereas in the 
USA, the historic record is shorter. The earthquake history can 
be extended by paleoseismology, a branch of geology that 
studies the past history of faults. One of the best examples is the 
use of geological data to infer the history of large earthquakes 
on a major southern segment of the San Andreas fault. The last 
major earthquake recorded at a site at Pallett Creek, Califor¬ 
nia, the 1857 Fort Tejon earthquake, is known from historical 
records to have caused shaking with an intensity of XI. The 


faulting is recorded by disruptions of sedimentary strata, 
including sand blows where material erupted during the 
earthquake. Sand blows and other structures from previous 
earthquakes were dated with radiometric carbon-14 methods, 
giving the dates of previous earthquakes. Despite the many 
uncertainties involved with these methods, including uncer¬ 
tainties in radiometric dating and the effects of climate varia¬ 
tions and burrowing animals, the data show that faulting has 
recurred over the past thousands of years. However, assessing 
the size of past earthquakes and whether some earthquakes 
were missed is difficult. 

The results can be surprising. For instance, large earthquakes 
near Pallett Creek appear to have occurred approximately in 
the years 1857, 1812, 1480, 1346, 1100, 1048, 997, 797, 734, 
and 671. Because the average time between events is 132 years, 












Fig. 1.2-15 Paleoseismic time series of earthquakes along the San Andreas 
fault near Pallett Creek, California, inferred from sedimentary deposits by 
Sieh et al. (1989). The sequence shows earthquake clusters separated by 
longer time intervals, illustrating the complexity of earthquake recurrence. 
(Keller and Pinter, Active Tectonics: earthquakes, uplift, and the 
landscape, © 1996. Reprinted by permission of Pearson Education.) 

we might have expected the next large earthquake around the 
year 1989. However, the intervals between earthquakes vary 
from 45 years to 332 years, with a standard deviation of 105 
years. Thus, given these data right after the 1857 earthquake, 
the simplest view would be that the earthquake would likely 
recur between 1885 and 2093. However, the time history sug¬ 
gests that something more complicated is going on (Fig. 1.2- 
15), as illustrated by the fact that the standard deviation of the 
recurrence time is similar to its mean. It looks as if the earth¬ 
quakes are clustered: three earthquakes between 671 and 797, 
then a 200-year gap, then three between 997 and 1100, fol¬ 
lowed by a 246-year gap. Hence, using the earthquake history 
to forecast the next big earthquake is challenging, and the 
study's authors concluded in 1989 that one could estimate 
the probability of a similar earthquake before 2019 as only 
somewhere in the range 7-51%. For example, if the cluster that 
included the 1812 and 1857 earthquakes is over, then it may be 
a long time until the next big earthquake there. 
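
The recurrence statistics quoted above can be reproduced from the listed dates. The following Python sketch is an illustration only: it takes the approximate Pallett Creek dates given in the text at face value, ignores their dating uncertainties, and assumes that the quoted standard deviation is the sample standard deviation.

```python
import statistics

# Approximate Pallett Creek earthquake years as listed in the text, oldest first
dates = [671, 734, 797, 997, 1048, 1100, 1346, 1480, 1812, 1857]

# Time between successive earthquakes, in years
intervals = [later - earlier for earlier, later in zip(dates, dates[1:])]

mean_interval = statistics.mean(intervals)
std_interval = statistics.stdev(intervals)  # sample standard deviation (assumed)

print("intervals:", intervals)
print(f"mean {mean_interval:.0f} yr, standard deviation {std_interval:.0f} yr")
print(f"shortest {min(intervals)} yr, longest {max(intervals)} yr")
print(f"mean date of next event: {dates[-1] + mean_interval:.0f}")
```

Adding and subtracting one standard deviation from the mean date gives a window close to the 1885-2093 range quoted above. The ratio of standard deviation to mean, about 0.8, is one simple measure of how far the sequence is from periodic.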

The variability of recurrence times is striking because these 
data span a long time history (10 earthquake cycles) on a 
plate boundary where the plate motion causing the earthquake 
is steady. The history of most faults is known only for the past 
few cycles, and the Pallett Creek data imply that these may not 
be representative of the long-term pattern. The recurrence may 
be even more complicated for earthquake zones within plates, 


many of which seem to act for only a few earthquake cycles, 
and others of which may be one-time events. Research, some 
of which is discussed in Section 5.7, is going on to investigate 
this complexity. 

Even with the dates of previous major earthquakes, it is diffi¬ 
cult to predict when the next will occur, as illustrated by the 
segment of the San Andreas fault near Parkfield, California. 
Compared to the southern segment just discussed, or the north¬ 
ern segment on which the 1906 earthquake occurred, the 
Parkfield segment is characterized by smaller earthquakes that 
occur more frequently and appear much more periodic. Earthquakes 
of magnitude 5-6 occurred in 1857, 1881, 1901, 1922, 
1934, and 1966. The average recurrence interval is 22 years, 
and a linear fit to these dates made 1988 the likely date of the 
next event. In 1985, it was predicted at the 95% confidence 
level that the next Parkfield earthquake would occur before 
1993, which was the USA’s first official earthquake prediction. 
A comprehensive observing system was set up to monitor elec¬ 
trical resistivity, magnetic field strength, seismic wave velocity, 
microseismicity, ground tilting, water well levels and chem¬ 
istry (especially radon content), and motion across the fault. 
The well-publicized experiment 10 hoped to observe precursory 
behavior, which seemed likely because surface cracks were 
observed 10 days before the 1966 earthquake and a pipeline 
ruptured 9 hours before the shock, and to obtain detailed 
records of the earthquake at short distances. As of 2002, the 
earthquake had not yet happened, making the current interval 
(35 years and growing) the longest yet observed between earth¬ 
quakes there. The next Parkfield earthquake will eventually 
occur, but its non-arrival to date illustrates both the limitations 
of the statistical approaches used in the prediction (including 
the omission of the 1934 earthquake on the grounds that it 
was premature and should have occurred in 1944) and the fact 
that even in the best of circumstances nature is not necessarily 
cooperative or easily predicted. For that matter, it is unclear 
whether the Parkfield segment of the San Andreas fault shows 
such unusual quasi-periodicity because it differs from other 
parts of the San Andreas fault (in which case predicting earth¬ 
quakes there might not be that helpful for other parts), or 
whether it results simply from the fact that, given enough time 
and different fault segments, essentially random seismicity can 
yield apparent periodicity somewhere. As is usual with such 
questions, only time will tell. 
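
The sensitivity of such a prediction to the treatment of a single event can be illustrated with a small calculation. The exact procedure behind the 1985 prediction is not given here, so the Python sketch below makes its own assumptions: an ordinary least-squares line through event number versus year, whole-year dates, and hypothetical function names. It compares a fit to all six Parkfield dates with one that omits the 1934 earthquake.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_next(event_numbers, years, next_number):
    """Extrapolate the fitted line to a later event number."""
    slope, intercept = fit_line(event_numbers, years)
    return slope * next_number + intercept

years = [1857, 1881, 1901, 1922, 1934, 1966]   # dates listed in the text
numbers = [1, 2, 3, 4, 5, 6]                   # event sequence numbers

pred_all = predict_next(numbers, years, 7)

# Omit 1934 but keep its place in the sequence (an assumption of this sketch)
kept = [(n, y) for n, y in zip(numbers, years) if y != 1934]
pred_no_1934 = predict_next([n for n, _ in kept], [y for _, y in kept], 7)

print(f"all six events: {pred_all:.1f}")
print(f"1934 omitted:   {pred_no_1934:.1f}")
```

With these assumptions, omitting the 1934 event moves the extrapolated date several years later, to within a year of the 1988 quoted above. A single judgment about one earthquake thus shifts the forecast by an amount comparable to the width of the stated confidence window.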

Such seismic forecasting involves the concept of seismic 
gaps, discussed further in Sections 4.7.3 and 5.4.3. The idea is 
that a long plate boundary like the San Andreas or an oceanic 
trench ruptures in segments. We would thus expect steady plate 
motion to cause earthquakes that fill in gaps and occur at 
relatively regular intervals. However, the Pallett Creek and 


10 The costs involved (more than $30 million) led The Economist magazine 
(Aug. 1, 1987) to argue that “Parkfield is geophysics' Waterloo. If the earthquake 
comes without warnings of any kind, earthquakes are unpredictable and science is 
defeated. There will be no excuses left, for never has an ambush been more carefully 
laid.” 
Fig. 1.2-16 Cross-section of the seismicity along the San Andreas fault before (top) and after (bottom) the 1989 Loma Prieta earthquake. This earthquake, 
whose rupture began at the large circle in the lower figure and is marked by the aftershocks (small circles), has been interpreted as filling a seismic gap along 
the San Andreas fault, although other interpretations have also been made. (Courtesy of the US Geological Survey.) 


Parkfield examples show that the earth is more complicated. 
Some earthquakes may fit the gap idea; the 1989 Loma Prieta 
earthquake and its aftershocks have been interpreted as filling 
a gap along the San Andreas fault (Fig. 1.2-16), although the 
fact that the earthquake differed from the expected fault 
geometry has also been interpreted as making it different from 
the expected gap-filling earthquake. In other areas, however, 
the gap hypothesis has not yet proved successful in identifying 
future earthquake locations significantly better than random 
guessing. Faults deemed likely to rupture have not done so, 
and earthquakes sometimes occur on faults that were either 
unknown or considered seismically inactive. Understanding if, 
where, and when the gap hypothesis is useful is thus an active 
research area. Until it is resolved, it is unclear whether it is 
better to assume that all segments of a given fault are equally 
likely to rupture, making the probability of a major earthquake 
independent of time, or whether the segment that ruptured 
longest ago should have since accumulated the greatest elastic 
strain, and therefore be most likely to rupture next. This issue is 
important for hazard estimates. 

In summary, several factors make earthquake forecasting 
difficult. In the meteorological case, storms occur frequently on 
human time scales, and we believe that we understand their 
basic physics. By contrast, the cycle of earthquakes on a given 
fault segment is long on a human time scale. Thus there are 
only a few places with a time history long enough to formulate 
useful hypotheses (recall that even the Pallett Creek 1000-year 
history shows major complexity). Moreover, because forecasts 
must be tested by their ability to predict future earthquakes, a 
long time will be needed to convincingly test models of earth¬ 
quake recurrence and hazards. Even worse, the fundamental 
physics of earthquake faulting is not yet understood. Clearly, 


the process is complex. Earthquakes are at best only crudely 
periodic, and sometimes appear instead to cluster in time. 
Faults display a continuum of behavior from locking, to slow 
aseismic creep, to earthquakes. Thus the theoretical and ex¬ 
perimental study of rock deformation and its application to 
earthquake faulting is an active field of research (Section 5.7). 

1.2.6 Earthquake prediction 

Earthquake prediction is defined as specifying within certain 
ranges the location, time, and size of an earthquake a few years 
to days before it occurs. Prediction is an even more difficult 
problem than long-term forecasting. A common analogy is that 
although a bending stick will eventually snap, it is hard to pre¬ 
dict exactly when. To do so requires either a theoretical basis 
for knowing when the stick will break, given a history of the 
applied force, or observing some change in physical properties 
that immediately precedes the stick’s failure. 

Because little is known about the fundamental physics of 
faulting, many attempts to predict earthquakes have searched 
for precursors, observable behavior that precedes earthquakes. 
To date, as discussed next, this search has proved generally un¬ 
successful. As a result, it is unclear whether earthquake predic¬ 
tion is even possible. In one hypothesis, all earthquakes start off 
as tiny earthquakes, which happen frequently, but only a few 
cascade via a random failure process into large earthquakes. 11 

11 This hypothesis draws on ideas from nonlinear dynamics or chaos theory, in 
which small perturbations can grow to have unpredictable large consequences. These 
ideas were posed in terms of the possibility that the flap of a butterfly’s wings in Brazil 
might set off a tornado in Texas, or in general that minuscule disturbances do not 
affect the overall frequency of storms but can modify when they occur (Lorenz, 1993). 
In this view, because there is nothing special about those tiny 
earthquakes that happen to grow into large ones, the interval 
between large earthquakes is highly variable, and no observ¬ 
able precursors should occur before them. If so, earthquake 
prediction is either impossible or nearly so. 

Support for this view comes from the failure to observe a 
compelling pattern of precursory behavior before earthquakes. 
Various possible precursors have been suggested, and some 
may have been real in certain cases, but none have yet proved to 
be a general feature preceding all earthquakes, or to stand out 
convincingly from the normal range of the earth’s variable 
behavior. Although it is tempting to note a precursory pattern 
after an earthquake based on a small set of data and to suggest 
that the earthquake might have been predicted, rigorous tests 
with large sets of data are needed to tell whether a possible 
precursory behavior is real and correlates with earthquakes 
more frequently than expected purely by chance. Most cru¬ 
cially, any such pattern needs to be tested by predicting future 
earthquakes. 

One class of precursors involves foreshocks, earthquakes 
that occur before a main shock. Many earthquakes, in hind¬ 
sight, have followed periods of anomalous seismicity. In some 
cases, there is a flurry of microseismicity: very small earth¬ 
quakes like the cracking that precedes a bent stick’s snapping. 
In other cases, there is no preceding seismicity. However, faults 
often show periods of either elevated or nonexistent micro¬ 
seismicity that are not followed by a large earthquake. Altern¬ 
atively, the level of microseismicity before a large event can 
be unremarkable, occurring at a normal low level. The lack of 
a pattern highlights the problem with possible earthquake pre¬ 
cursors: to date, no changes that might be associated with an 
upcoming earthquake are consistently distinguishable from the 
normal variations in seismicity that are not followed by a large 
earthquake. 

Another class of possible precursors involves changes in the 
properties of rock within a fault zone preceding a large earth¬ 
quake. It has been suggested that as a region experiences a 
buildup of elastic stress and strain, microcracks may form and 
fill with water, lowering the strength of the rock and eventually 
leading to an earthquake. This effect has been advocated based 
on data showing changes in the level of radon gas, presumably 
reflecting the development of microcracks that allow radon 
to escape. For example, the radon detected in groundwater 
rose steadily in the months before the 1995 Kobe earthquake, 
increased further two weeks before the earthquake, and then 
returned to a background level (Fig. 1.2-17). 

A variety of similar observations have been reported. In 
some cases, the ratio of P- and S-wave speeds in the region of an 
earthquake has been reported to have decreased by as much as 
10% before an earthquake. Such observations would be con¬ 
sistent with laboratory experiments, and would reflect cracks 
opening in the rock (lowering wave speeds) due to increasing 
stress and later filling (increasing wave speeds). However, 
this effect has not been substantiated as a general phenomenon. 
Similar difficulties beset reports of a decrease in the 
Fig. 1.2-17 Radon within groundwater before and after the January 16, 
1995, Kobe earthquake in Japan. (Igarashi et al., 1995. Reprinted with 
permission from Science, 269, 60-1. Copyright 1995, American 
Association for the Advancement of Science.) 


electrical resistivity of the ground before some earthquakes, 
consistent with large-scale microcracking. Changes in the 
amount and composition of groundwater have also been ob¬ 
served. For example, a geyser in Calistoga, California, changed 
its period between eruptions before the 1989 Loma Prieta and 
1975 Oroville, California, earthquakes. 

Efforts have also been made to identify ground deformation 
immediately preceding earthquakes. The most famous of these 
studies was the report in 1975 of 30-45 cm of uplift along 
the San Andreas fault near Palmdale, California. This highly 
publicized “Palmdale Bulge” was interpreted as evidence of an 
impending large earthquake and was a factor in the US govern¬ 
ment’s decision to launch the National Earthquake Hazards 
Reduction Program aimed at studying and predicting earth¬ 
quakes. However, the earthquake did not occur, and reanalysis 
of the data implied that the bulge had been an artifact of errors 
involved in referring the vertical motions to sea level via a 
traverse across the San Gabriel mountains. Subsequent studies, 
using newer and more accurate techniques including the 
Global Positioning System satellites, satellite radar interfero¬ 
metry, and borehole strainmeters have not yet convincingly 
detected precursory ground deformation. 

An often-reported precursor that is even harder to quantify 
is anomalous animal behavior. What the animals are sensing 
(high-frequency noise, electromagnetic fields, gas emissions) is 
unclear. Moreover, because it is hard to distinguish “anoma¬ 
lous” behaviors from the usual range of animal behaviors, 
most such observations have been “postdictions,” coming 
after rather than before an earthquake. 

Despite these difficulties, Chinese scientists are attempting to 
predict earthquakes using precursors. Chinese sources report 
a successful prediction in which the city of Haicheng was 
evacuated in 1975, prior to a magnitude 7.4 earthquake that 
damaged more than 90% of the houses. The prediction is said 
to have been based on precursors, including ground deforma¬ 
tion, changes in the electromagnetic field and groundwater 
levels, anomalous animal behavior, and significant foreshocks. 
However, in the following year, the Tangshan earthquake 
occurred not too far away without precursors. In minutes, 
250,000 people died, and another 500,000 people were 
injured. In the following month, an earthquake warning in the 
Kwangtung province caused people to sleep in tents for two 
months, but no earthquake occurred. Because foreign scientists 
have not yet been able to assess the Chinese data and the 
record of predictions, including both false positives (predic¬ 
tions without earthquakes) and false negatives (earthquakes 
without predictions), it is difficult to evaluate the program. 

In summary, despite tantalizing suggestions, at present there 
is still an absence of reliable precursors. The frustrations of this 
search have led to the wry observation that “it is difficult 
to predict earthquakes, especially before they happen.” Most 
researchers thus feel that although earthquake prediction 
would be seismology’s greatest triumph, it is either far away 
or will never happen. However, because success would be of 
enormous societal benefit, the search for methods of earth¬ 
quake prediction will likely continue. 

1.2.7 Real-time warnings 

Some recent efforts are directed to the tractable goal of real¬ 
time warnings, where seismometers trigger an immediate 
warning if a set of criteria is met. For tsunamis, the warning 
may be several hours in advance, which is enough time for 
preparations. This is because tsunamis travel more slowly 
than seismic waves. A P wave travels from Alaska to Hawaii 
in about 7 minutes, whereas a tsunami traveling at about 
800 km/hr across the ocean takes 5.5 hours. After the damage 
done to Hilo by the 1946 Alaska earthquake, the Seismic Sea 
Wave Warning System was organized for countries that rim the 
Pacific Ocean. Information from seismometers and tide gauges 
was phoned to the Tsunami Warning Center in Honolulu, 
Hawaii, which issued tsunami alerts if necessary. 12 Tsunami 
warning systems have since become more automated, using 
real-time digital seismic data to locate large earthquakes and 
derive information about their magnitudes, depths, and focal 
mechanisms. An assessment can be made of the likelihood of 
a tsunami, which usually results from vertical motion at the sea 
floor. 
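
The warning time available for a distant tsunami follows from simple arithmetic on the figures above. The Python sketch below uses only the values quoted in the text (800 km/hr, 5.5 hours, 7 minutes); the function name is hypothetical, and treating the P wave as covering the same path length as the tsunami is a simplifying assumption, since the P wave actually travels through the earth's interior.

```python
def tsunami_lead_time(tsunami_speed_kmh, tsunami_hours, p_wave_minutes):
    """Return (path length in km, warning lead in minutes, apparent P speed in km/s)."""
    distance_km = tsunami_speed_kmh * tsunami_hours
    lead_minutes = tsunami_hours * 60.0 - p_wave_minutes
    # Apparent speed over the same path length (a simplification)
    p_speed_kms = distance_km / (p_wave_minutes * 60.0)
    return distance_km, lead_minutes, p_speed_kms

distance_km, lead_minutes, p_speed_kms = tsunami_lead_time(800.0, 5.5, 7.0)
print(f"path length: {distance_km:.0f} km")
print(f"warning lead: {lead_minutes:.0f} minutes")
print(f"apparent P-wave speed: {p_speed_kms:.1f} km/s")
```

The lead of more than five hours is what makes tsunami warnings practical; the same calculation for seismic waves themselves, which travel kilometers per second, leaves only seconds to tens of seconds.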

The situation is much more complicated with seismic waves. 
Although local seismic networks can automatically and imme¬ 
diately locate an earthquake and assess if it is hazardous, the 
warning time is short. For example, a warning after a major 
earthquake on the New Madrid fault system instantly relayed 
via Internet or radio to St Louis would arrive about 40 seconds 

12 Serious or older television viewers may recall the episode of Hawaii 5-0 in which 
criminals force the center to issue a spurious tsunami warning to prompt evacuation 
of downtown Honolulu and facilitate a robbery. 


before the first seismic waves. Seismologists, engineers, and 
public authorities are thus discussing what might be done with 
such short warning times. Although such times would not per¬ 
mit evacuations, certain steps might be useful. For example, 
real-time warnings are used in Japan to stop high-speed trains, 
and it may be practical to have gas line shut-off valves or other 
automatic responses connected to such a system. The questions 
are whether the improved safety justifies the cost and whether 
the risk of false alarms is serious. 

A related approach is to provide authorities with near- 
real-time information, including data on the distribution of 
shaking, immediately after major earthquakes. Seismic net¬ 
works are working to provide emergency management services 
with information that can help direct the needed response to 
the most affected areas during the chaotic few hours after a 
large earthquake, when the location and extent of damage are 
often still unclear. 

1.2.8 Nuclear monitoring and treaty verification 

Another important societal application of seismology is the 
monitoring of nuclear testing. Although atomic physics destab¬ 
ilized world politics through the invention of the atomic bomb, 
seismology has partially restabilized it. Throughout the cold 
war between the USA and the Soviet Union, seismology helped 
verify that treaties were being observed. 

The role of seismology in nuclear monitoring began in 1957 
when the USA detonated RAINIER, the first underground 
nuclear explosion. By the early 1960s it became clear that 
radioactive elements produced by atmospheric nuclear testing 
posed significant health threats. In 1963, 116 nations signed 
the Limited Test Ban Treaty, which banned nuclear testing 
in the atmosphere, in the oceans, and in space, and required 
testing to occur underground. At about this time, the US Air 
Force helped fund the deployment of the World Wide Standardized
Seismographic Network (WWSSN). WWSSN stations
provided important information for monitoring nuclear testing
and a wealth of data that played a major role in modern geophysical
seismology.

In 1976, countries began to abide by the Threshold Test Ban 
Treaty, which limited the size of underground nuclear tests to 
150 kt (equivalent to 150 kilotons of TNT). Before then, the 
largest atmospheric test had been 58 Mt, and the largest underground
test had been 4.4 Mt. Figure 1.2-18 shows the yields estimated
seismologically for underground nuclear tests carried
out by the Soviet Union. Although it was initially thought that 
some of the post-1976 explosions were greater than 150 kt, 
this turned out to reflect the different geologies of the western 
USA and central Asia. The conversion of seismic body wave 
magnitude $m_b$ values into TNT yields was calibrated using the
Nevada test site, but the western US crust is more seismically 
attenuating than the more stable Soviet sites in Kazakhstan 
and Novaya Zemlya (see Section 3.7.10).

Fig. 1.2-18 Yields of underground nuclear tests carried
out by the Soviet Union, determined through seismically
observed $m_b$ magnitudes. After the Threshold Test Ban
Treaty (TTBT), seismology verified that the Soviet
Union was in general compliance with the 150-kiloton
limit. Data courtesy of P. Richards (personal
communication).

Fig. 1.2-19 Seismograms showing the differences
between an earthquake and an explosion. For shallow
earthquakes, in this case an $m_b$ 4.8 shock in India, the
P wave is much smaller than the surface waves. By
contrast, the initial P wave is the largest arrival for
explosions like this Indian nuclear test. Data recorded
at Nilore, Pakistan. (Courtesy of the Incorporated
Research Institutions for Seismology.)

The yields of explosions in kilotons, $Y$, can be related to the observed seismic
magnitudes by

$$m_b = C + 0.75 \log Y, \qquad (2)$$

but the constant differs for Nevada (C = 3.95) and Kazakhstan 
(C = 4.45). With these corrections, it appears that the Soviet 
Union complied with the treaty. 
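
The effect of the regional constant in Eqn 2 can be illustrated numerically. The following is a minimal sketch, assuming that the logarithm in Eqn 2 is base 10; the function and variable names are hypothetical and chosen only for this illustration. It computes the magnitude of a test at the 150 kt limit in Kazakhstan and then shows the yield that would be inferred if the Nevada constant were applied by mistake.

```python
import math

# Constants C quoted in the text for the two test regions.
C_NEVADA = 3.95
C_KAZAKHSTAN = 4.45
SLOPE = 0.75  # coefficient of log Y in Eqn 2


def magnitude_from_yield(yield_kt, c):
    """Body wave magnitude m_b for a yield in kilotons (log assumed base 10)."""
    return c + SLOPE * math.log10(yield_kt)


def yield_from_magnitude(mb, c):
    """Invert Eqn 2 to get the yield in kilotons from m_b."""
    return 10.0 ** ((mb - c) / SLOPE)


# A test exactly at the 150 kt treaty limit, conducted in Kazakhstan.
mb_limit = magnitude_from_yield(150.0, C_KAZAKHSTAN)

# The same observed magnitude interpreted with each regional calibration.
yield_if_nevada = yield_from_magnitude(mb_limit, C_NEVADA)
yield_if_kazakhstan = yield_from_magnitude(mb_limit, C_KAZAKHSTAN)

print(f"m_b at 150 kt (Kazakhstan constant): {mb_limit:.2f}")
print(f"yield inferred with Nevada constant: {yield_if_nevada:.0f} kt")
print(f"yield inferred with Kazakhstan constant: {yield_if_kazakhstan:.0f} kt")
```

Under these assumptions, the difference of 0.5 in $C$ changes the inferred yield by a factor of $10^{0.5/0.75}$, or about 4.6, which is why an uncorrected calibration made compliant tests appear to exceed the limit.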

Monitoring nuclear tests requires distinguishing them from
earthquakes. Examples of the differences are shown in Fig. 1.2-19
for an earthquake and an explosion in India. Earthquakes
occur by slip across a fault, generating large amounts of shear
wave energy and hence large surface waves. By contrast, explosions
involve motions away from the source, and so produce
far less shear wave energy. Hence, for bombs the surface waves
are dwarfed by the initial P wave. This difference is the basis for
discrimination between earthquakes and explosions. A plot of
$M_s$ vs $m_b$ (Fig. 1.2-20) separates earthquakes, which generate
more surface wave energy ($M_s$), from the explosions, which
generate more body (P) wave energy ($m_b$).

The challenge of seismic monitoring has increased in recent 
years. Since 1996 the USA has abided by the Comprehensive 
Test Ban Treaty (CTBT), which bans all nuclear testing, preventing
the development of new nuclear weapons. Thus the
focus of US monitoring efforts has expanded to include smaller 
countries around the world. 13 There is also the need to identify 
possible smaller nuclear tests, including those by terrorists. 
Hence seismic monitoring must identify explosions less than 
1 kt, which have a magnitude of 4-4.5 (Eqn 2). This requires 
locating and identifying more than 200,000 earthquakes and 
additional mining explosions every year. 

13 A strategy described as “In God we trust, all others we verify.”

Fig. 1.2-20 Body wave magnitudes ($m_b$) versus surface wave magnitude
($M_s$) and seismic moment ($M_0$) for a set of earthquakes and explosions in
the western USA. Because the P waves of explosions are very large, as
shown in the previous figure, they have anomalously high $m_b$ values for a
given source energy (represented by $M_0$). A comparison of $m_b$ and $M_0$ can
thus discriminate between earthquakes and nuclear explosions. (After Al-Eqabi
et al., 2001. © Seismological Society of America. All rights reserved.)


An important part of this effort is the International Monitoring
System (IMS), whose aim is to detect, locate, and identify
nuclear detonations that occur underground, underwater,
or above ground. To do this, the IMS will combine seismological,
hydroacoustic, and infrasound networks. Underwater
nuclear tests create sound waves that travel efficiently through 
the ocean (Section 2.5.8), so a network of hydroacoustic 
stations will be established, with some sites using underwater 
hydrophones and others on islands to observe seismic phases 
that are generated when the oceanic acoustic waves reach 
land. Nuclear tests in the atmosphere will be detected by the 
infrasonic (frequencies less than 20 Hz, below the human 
hearing range) sound waves they generate. The IMS infrasound 
network will consist of small arrays of microphones that can 
determine the direction in which the infrasonic waves are 
traveling, so detection at multiple stations will identify the 
source of the waves. 

Because most clandestine tests would likely occur underground,
seismic stations will be a vital part of the IMS. The IMS
seismic network will have 50 primary stations with three-component
broadband seismometers. About half of these sites
will be augmented with local arrays of short-period vertical-component
sensors. Data will be telemetered in real time, so
that there is no delay in monitoring. An auxiliary network
of 120 broadband stations, distributed over 61 countries and
largely based on existing networks, will aid in discrimination
and replace malfunctioning primary stations.

Further reading 

The seismological topics introduced in this chapter are discussed elsewhere 
in the text, so references are given in the appropriate sections. Many other 
references exist for the topics of societal interest discussed here. 

Popular accounts of issues related to earthquakes include Gere and Shah 
(1984), Bolt (1999), and Brumbaugh (1999). Introductory treatments 
dealing with earthquakes and volcanoes from the point of view of the geology
and hazards include Alexander (1993), Kovach (1995), and Sieh and
LeVay (1998). The World Wide Web contains a wealth of general earthquake
information; sites to start at include http://www.scec.org, http://www.seismosoc.org,
http://www.iris.edu, and http://earthquake.usgs.gov.
Specific issues related to volcano prediction studies at Mammoth Lakes 
are discussed by Sieh and LeVay (1998) and Hill (1998). For discussions 
of paleoseismology and geological effects of earthquakes, see Keller and 
Pinter (1996) and Yeats et al. (1997). The role of seismology in the plate 
tectonic revolution is discussed by Cox (1973) and Menard (1986); the 
general idea of scientific revolutions as “paradigm shifts” is given by Kuhn 
(1962). 

Issues of assessing probabilities and uncertainties are discussed by 
Ekeland (1993); Henrion and Fischoff (1986) analyze the history of measurements
of physical constants. Probabilistic seismic hazard analysis is discussed
by Reiter (1990), Hanks and Cornell (1994), and Hanks (1997).
The US Geological Survey National Seismic Hazard maps are described 
by Frankel et al. (1996), and a global hazard map is described by Shedlock 
et al. (2000). Uncertainties in earthquake probabilities for California are 
discussed by Savage (1991). Real-time seismology applications to earthquake
risk mitigation are discussed by Kanamori et al. (1997). Sarewitz
et al. (2000) discuss general issues of prediction and policy for the earth 
sciences, including earthquake prediction. Geschwind (2001) reviews the 
history of seismic risk mitigation and earthquake prediction policies in the 
USA. 

A considerable volume of scientific literature addresses earthquake prediction,
often arguing whether either a specific approach or any method
can predict earthquakes. Turcotte (1991) gives a general review of many
aspects of the topic, and Geller (1997) summarizes the history of earthquake
prediction efforts, including that at Parkfield and the Palmdale
Bulge. Geller et al. (1997) and Evans (1997) argue that earthquakes are
unpredictable; Lomnitz (1994), Wyss et al. (1997), and Sykes et al. (1999)
argue the other side. The Parkfield earthquake prediction experiment
is summarized by Roeloffs and Langbein (1994); Davis et al. (1989) and 
Savage (1993) discuss the limitations of the statistical approach used. The 
controversy over the seismic gap hypothesis is discussed by Stein (1992); 
Kagan and Jackson (1991) and Jackson and Kagan (1993) argue against 
the hypothesis, and Nishenko and Sykes (1993) argue for it. 

Earthquake engineering is discussed by Bray (1995), Chopra (1995), 
Krinitzsky et al. (1993), and Wiegel (1970). A good World Wide Web site 
to start at is http://www.eeri.org , which also provides an introduction to 
earthquake insurance. Issues in natural disaster insurance are discussed by 
Michaels et al. (1997).

Bolt (1976), Sykes and Davis (1987), Richards and Zavales (1990), 
and Lay (1992) discuss seismic verification of nuclear testing. More 
description of the Comprehensive Test Ban Treaty can be found at 
http://pws.ctbto.org.

A very interesting example of sound waves in a solid, both longitudinal and transverse, are waves in the solid earth. Inside the earth,
from time to time, there are earthquakes so sound waves travel around in the earth. Therefore if we place a seismograph at some location
and watch the way the thing jiggles after there has been an earthquake somewhere else, we might get a jiggling, and a quieting
down, and another jiggling... By using a large number of observations of many earthquakes at different places, we know what is 
inside the earth. 

Richard Feynman, The Feynman Lectures on Physics, 1963 


2.1 Introduction 

We begin the study of seismic waves in the earth by addressing 
two basic questions. First, what in the physics of the solid earth 
allows waves to propagate through it? Second, how does the 
propagation of seismic waves depend on the nature of the 
material within the earth? 

We will see that seismic waves propagate through the earth 
because the material within it, though solid, can undergo 
internal deformation. As a result, earthquakes and other disturbances
generate seismic waves, which give information
about both the source of the waves and the material they pass 
through. 

To motivate these ideas, we first discuss a stretched string, a 
simple physical system that gives rise to waves analogous to 
seismic waves in the earth. As for the solid earth, deforming the 
string causes displacements that are functions of space and time 
satisfying the wave equation. The velocity of the propagating 
waves depends on the physical properties of the string in a way 
similar to that for waves in the earth, and the waves respond to 
changes in the physical properties of the string in ways 
analogous to what occurs for waves in the earth. 

After discussing the string, we develop basic ideas about the 
mechanics of the solid earth. We introduce the stress tensor, 
which describes the forces acting within a deformable solid 
material, and the strain tensor, which describes the deformation.
We then explore the relation between these tensors,
and show that the displacements within the material can be 
described as functions of position and time satisfying the wave 
equation. Specifically, we will see how two types of seismic 
waves, P and S, propagate. 


We then introduce concepts of wave propagation in the 
earth, with emphasis on how waves behave when they encounter
changes in physical properties. These ideas give us the tools
for Chapter 3, which discusses how seismic waves are used
to study the interior of the earth, and Chapter 4, where we discuss
how seismic waves are used to study earthquakes.

Although we focus on seismic waves, many of the concepts 
are similar to ones for other types of waves, so we will sometimes
draw analogies to familiar behavior of light, water, and
sound waves. 

2.2 Waves on a string 

2.2.1 Theory 

We consider an idealized mathematical string that extends in 
the x direction. Initially the string is straight in response to 
a tension force $\tau$ exerted along it, so $u$, the displacement from
the equilibrium position in the y direction, is zero everywhere. 
After the string is plucked, portions of the string are displaced 
from their equilibrium positions and disturbances move along 
the string. 

Our goal is to describe the displacement $u(x, t)$ as a function
of both position along the string and of time. To do this, we
apply Newton’s second law of motion, $\mathbf{F} = m\mathbf{a}$, which states that
the force vector equals the mass times the acceleration vector, 1
to a segment $dx$ of the string. Once the string segment is displaced,
the string is stretched and the tension directed along the
string gives rise to forces (Fig. 2.2-1) in the y direction of
$\tau \sin\theta_2$ and $-\tau \sin\theta_1$ at the ends of the segment. The net force in
the y direction equals the inertial term, which is the acceleration
(second time derivative of the displacement) times the
mass, where the mass is the product of the density $\rho$ and $dx$.
Hence, the vector equation $\mathbf{F} = m\mathbf{a}$ becomes the scalar equation

$$F(x, t) = \tau \sin\theta_2 - \tau \sin\theta_1 = \rho\, dx\, \frac{\partial^2 u(x, t)}{\partial t^2}. \qquad (1)$$

1 Bold face is commonly used to denote vectors; see Section A.3.1.

Fig. 2.2-1 Geometry of a segment of a string subject to a tension $\tau$.
A slight difference in the angles $\theta_1$ and $\theta_2$ provides a net force in the
y direction of $F = \tau \sin\theta_2 - \tau \sin\theta_1$, which accelerates the string.


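If the angles $\theta$ are small, $\sin\theta \approx \theta \approx \tan\theta$ can be approximated
by the slope, so

$$\tau \left[ \frac{\partial u(x + dx, t)}{\partial x} - \frac{\partial u(x, t)}{\partial x} \right] = \rho\, dx\, \frac{\partial^2 u(x, t)}{\partial t^2}, \qquad (2)$$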


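which can be expanded by forming a Taylor series and discarding
the higher-order terms:

$$\tau \left[ \frac{\partial u(x, t)}{\partial x} + \frac{\partial^2 u(x, t)}{\partial x^2}\, dx - \frac{\partial u(x, t)}{\partial x} \right] = \tau \frac{\partial^2 u(x, t)}{\partial x^2}\, dx = \rho\, dx\, \frac{\partial^2 u(x, t)}{\partial t^2}, \qquad (3)$$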


yielding the wave equation:

$$\frac{\partial^2 u(x, t)}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2 u(x, t)}{\partial t^2}, \qquad (4)$$

where $v = (\tau/\rho)^{1/2}$.

This equation gives the relationship between the time and
space derivatives of the displacement $u(x, t)$ along the string.
We will see that the coupling between the two partial derivatives
gives rise to waves propagating along the string with a
velocity $v$. Because (4) describes the propagation of the scalar
quantity $u(x, t)$ in one space dimension, it is called the
one-dimensional scalar wave equation.

Fig. 2.2-2 “Snapshots” of a string showing a pulse $f(x - 2t)$ traveling to
the right in the +x direction. Because the velocity is 2, the pulse moves two
distance units during each time unit. This pulse is one of many forms a
traveling wave can take.

The wave equation is easily solved, because any function 
with the form u(x, t) = f(x ± vt) is a solution. To show this, note 
that the partial derivatives are 

$$\frac{\partial^2 u(x, t)}{\partial x^2} = f''(x \pm vt) \quad \text{and} \quad \frac{\partial^2 u(x, t)}{\partial t^2} = v^2 f''(x \pm vt), \qquad (5)$$

where $f''$ is the second derivative of $f$ with respect to its argument.
Thus, although we often think of solutions to the wave
equation as sines and cosines, any function whose argument is 
(x ± vt) is a solution. 
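
The claim that any function of $(x \pm vt)$ solves the wave equation can be checked numerically. The sketch below is illustrative only: it assumes a Gaussian pulse shape and a speed of 2 (as in Fig. 2.2-2), and the helper names are hypothetical. It estimates both second derivatives by central finite differences and confirms that the two sides of Eqn 4 agree to within the discretization error.

```python
import math

V = 2.0  # wave speed, matching the pulse f(x - 2t) of Fig. 2.2-2


def pulse(arg):
    """An arbitrary smooth pulse shape f; a Gaussian is assumed here."""
    return math.exp(-arg * arg)


def u(x, t):
    """Displacement for a pulse traveling in the +x direction."""
    return pulse(x - V * t)


def second_derivative(func, at, h=1e-3):
    """Central finite-difference estimate of a second derivative."""
    return (func(at + h) - 2.0 * func(at) + func(at - h)) / (h * h)


def wave_equation_residual(x, t):
    """Left side minus right side of Eqn 4 at the point (x, t)."""
    d2u_dx2 = second_derivative(lambda xx: u(xx, t), x)
    d2u_dt2 = second_derivative(lambda tt: u(x, tt), t)
    return d2u_dx2 - d2u_dt2 / (V * V)


# Sample a few positions and times; the residual should be near zero.
residuals = [abs(wave_equation_residual(x, t))
             for x in (0.3, 1.0, 2.5) for t in (0.0, 0.4, 1.1)]
print("largest residual:", max(residuals))

# The pulse keeps its shape: one time unit later it sits two distance units on.
print(u(1.0, 0.0) == u(3.0, 1.0))
```

Replacing `pulse` with any other twice-differentiable function should leave the residual small, which is the point of the argument in the text.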

To see that a function f(x - vt) describes a propagating wave, 
consider how it varies in space and time. As time increases by 
an increment dt, the argument stays constant provided that the 
distance increases by vdt. Because the function’s value stays 
the same when its argument is constant, f(x - vt) describes a 
wave of constant shape propagating with velocity v in the 
positive x direction (Fig. 2.2-2). Similarly, because (x + vt) is 
constant if x decreases as time increases, f(x + vt) describes a 
wave propagating with velocity v in the -x direction. The sign 
relating the x and t terms thus shows which way the wave 
travels. We follow seismological convention and use the vector 
term “velocity” for v, although it is a scalar and thus better 
termed a “speed.” 

The velocity $v = (\tau/\rho)^{1/2}$ at which the waves propagate
depends on two physical properties of the string: the tension
with which it is stretched and its density. Equation 1 shows
how these properties interact. Because the tension provides
the force that tends to restore any displacement to the equilibrium
position, greater tension gives higher acceleration and
thus faster wave propagation. In contrast, because the density
appears in the inertial term, higher density gives lower acceleration
and slower wave propagation.

The fact that the velocity depends on the density illustrates
one of the reasons why the string is a useful analogy for seismic
waves in the earth. One goal of seismology is to study the composition
of the earth. For this purpose, we measure the time
that waves take to travel between sources and receivers, find 
the velocity at which the waves propagated, and thus learn 
about the properties of the earth. 
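
As a small illustration of how the velocity responds to the string properties, the following sketch evaluates the square root of tension over density for a few cases. The numerical values and the function name are assumptions chosen for convenience, not values taken from a particular figure.

```python
import math


def string_velocity(tension, density):
    """Wave speed on a string: square root of tension over density."""
    return math.sqrt(tension / density)


base = string_velocity(tension=9.0, density=1.0)
tighter = string_velocity(tension=36.0, density=1.0)  # four times the tension
heavier = string_velocity(tension=9.0, density=4.0)   # four times the density

print(base, tighter, heavier)
```

Quadrupling the tension doubles the speed, whereas quadrupling the density halves it, consistent with the roles of the restoring force and the inertial term described above.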

2.2.2 Harmonic wave solution

Any function of the form f(x ± vt) describes a propagating wave 
as a function of time and distance. A particularly useful form is 
a harmonic or sinusoidal wave 2 

$$u(x, t) = A e^{i(\omega t \pm kx)} = A \cos(\omega t \pm kx) + Ai \sin(\omega t \pm kx). \qquad (6)$$

A harmonic wave is characterized by its amplitude $A$ and two
parameters, $\omega$ and $k$, which we will discuss shortly. Substituting
into the wave equation (4) and canceling the exponential
and constant show that the wave velocity is the ratio

$$v = \omega/k. \qquad (7)$$
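
The substitution can be written out explicitly. A sketch of the intermediate steps, using the harmonic wave of Eqn 6 in the wave equation (4) and taking the positive root for the velocity:

```latex
\begin{align*}
\frac{\partial^2 u}{\partial x^2} &= (\pm ik)^2 A e^{i(\omega t \pm kx)} = -k^2 u, \\
\frac{\partial^2 u}{\partial t^2} &= (i\omega)^2 A e^{i(\omega t \pm kx)} = -\omega^2 u, \\
-k^2 u &= -\frac{\omega^2}{v^2}\, u
\quad \Longrightarrow \quad v = \frac{\omega}{k}.
\end{align*}
```

The amplitude $A$ and the exponential cancel from both sides, so the result holds for either sign in the exponent.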

Although the exponential function u(x, t) in Eqn 6 is 
complex, the physical displacement must be real. We thus 
describe the displacement as the real part of $u(x, t)$. The complex
exponential form can be used for most purposes, because
when a complex exponential appears in the solution of a 
physical problem, its conjugate also appears, so their sum 
yields a real displacement. 

To understand the harmonic wave solution, consider the wave
given by the real part of $u(x, t)$, which is $A \cos(\omega t - kx)$. Figure
2.2-3 shows how this function varies with both distance
and time. The value of $u$ is constant when the phase $(\omega t - kx)$
remains constant, as for a crest or a trough. Such lines of constant
phase require that $x$ increases when $t$ increases. These
lines indicate waves propagating in the +x direction at a velocity
shown by $dx/dt$, the slope of the line in the x-t plane.

Additional insight comes by examining $u(x, t)$ at a point in
space, $x_0$. In terms of Fig. 2.2-3, this is a slice of the function on
a plane parallel to the time axis, which intersects the distance
axis at $x_0$. This gives a periodic function of time,
$u(x_0, t) = A \cos(\omega t - kx_0)$ (Fig. 2.2-4, top). Because the function returns
to the same value when $\omega t$ changes by $2\pi$, the oscillation is
characterized by the period, $T = 2\pi/\omega$, the time over which it
repeats. The periodicity can also be described by the frequency,
$f = 1/T = \omega/(2\pi)$, the number of oscillations within a unit time,
or by the angular frequency, $\omega = 2\pi f$. The period has the dimensions
of time, so the frequency and angular frequency have
dimensions of time$^{-1}$. In Fig. 2.2-3, for example,
$u(x, t) = A \cos(\pi t - 2\pi x)$, so the angular frequency is $\pi$ (time units)$^{-1}$,
the frequency is 1/2 (time units)$^{-1}$, and the period is 2 time units.
Thus the interval shown, 4 time units, includes two full cycles
of the oscillation. Equivalently, 1/2 a cycle occurs in a unit time.

2 Properties of complex numbers are reviewed in Section A.2.

Fig. 2.2-3 Displacement as a function of position and time for the
harmonic wave $u(x, t) = A \cos(\pi t - 2\pi x)$ propagating in the +x direction.
A line following a peak (or any part of the wave) in space and time
represents the wave’s velocity.

Fig. 2.2-4 A harmonic wave $u(x, t) = A \cos(\omega t - kx)$ shown at a fixed
position as a function of time (top) and at a fixed time as a function of
position (bottom).

Alternatively, we can examine $u(x, t)$ at a fixed time, $t_0$,
and plot $u(x, t_0) = A \cos(\omega t_0 - kx)$ as a function of position
(Fig. 2.2-4, bottom). In terms of Fig. 2.2-3, this is a slice of the
function on a plane parallel to the distance axis, which intersects
the time axis at $t_0$. The displacement is periodic in space
over a distance equal to the wavelength, $\lambda = 2\pi/k$, the distance
between two corresponding points in a cycle. How
the oscillation repeats in space can also be described by $k$,
the wavenumber or spatial frequency, which is $2\pi$ times the
number of cycles occurring in a unit distance. The wavelength
has units of distance, so the wavenumber has dimensions of
distance$^{-1}$. In Fig. 2.2-3 the wavelength is 1 distance unit, four
cycles occur in the 4-distance unit interval shown, and the
wavenumber is $2\pi$ (distance units)$^{-1}$. Note that the wavelength
and wavenumber are analogous, for constant time, to the
period and angular frequency for constant $x$.

Table 2.2-1 Relationships between wave variables.

| Quantity | Units | Relationships |
|---|---|---|
| Velocity | distance/time | $v = \omega/k = f\lambda = \lambda/T$ |
| Period | time | $T = 2\pi/\omega = 1/f = \lambda/v$ |
| Angular frequency | time$^{-1}$ | $\omega = 2\pi/T = 2\pi f = kv$ |
| Frequency | time$^{-1}$ | $f = \omega/(2\pi) = 1/T = v/\lambda$ |
| Wavelength | distance | $\lambda = 2\pi/k = v/f = vT$ |
| Wavenumber | distance$^{-1}$ | $k = 2\pi/\lambda = \omega/v = 2\pi f/v$ |

Table 2.2-1 summarizes the relationships between the different
wave parameters. All these relations can be derived from
$v = \omega/k$ and the definitions of the other quantities. Note the
analogy between period and angular frequency, which describe 
the wave in time at a fixed point in space, and wavelength and 
wavenumber, which describe the wave in space at a fixed time. 
Although the different relations may seem confusing, they are 
easy to remember using the dimensions of the quantities. For 
example, velocity must be the ratio of wavelength to period, 
not their product. 
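
The relations in Table 2.2-1 are easy to check with a few lines of code. This is a minimal sketch with a hypothetical helper function; it takes the angular frequency and wavenumber as inputs and is evaluated for the wave of Fig. 2.2-3, for which the angular frequency is $\pi$ and the wavenumber is $2\pi$.

```python
import math


def wave_parameters(omega, k):
    """Return quantities of Table 2.2-1 from angular frequency and wavenumber."""
    period = 2.0 * math.pi / omega
    wavelength = 2.0 * math.pi / k
    return {
        "velocity": omega / k,
        "period": period,
        "frequency": 1.0 / period,
        "wavelength": wavelength,
    }


# The wave of Fig. 2.2-3: u(x, t) = A cos(pi t - 2 pi x).
params = wave_parameters(omega=math.pi, k=2.0 * math.pi)
for name, value in params.items():
    print(f"{name}: {value:.3f}")

# Dimensional check: velocity is wavelength divided by period.
ratio = params["wavelength"] / params["period"]
print(math.isclose(params["velocity"], ratio))
```

For these inputs the period is 2 time units, the frequency 1/2, and the wavelength 1 distance unit, as stated in the text, giving a velocity of 1/2 distance unit per time unit.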

Thus $A e^{i(\omega t \pm kx)}$ represents a wave field that is a function of
both space and time. Often we hold one quantity fixed and 
observe the variation in the other. We can pick a point on a string 
and record a seismogram (“stringogram”) of the displacement 
as a function of time. By contrast, a “snapshot” picture of the 
waves on the string shows the displacement as a function of 
position, at a given time. These ideas apply to other wave 
phenomena, such as water waves incident on a beach. A lifeguard,
looking over the water at an instant of time, sees a wave
field that varies in space. A swimmer, at a location in the water,
encounters waves that vary in time. Both are observing, in 
different ways, a wave field that varies in both space and time. 
We will see that the same concept applies to seismic waves. 

The harmonic wave solution describes a sinusoidal wave of 
a particular frequency. This might seem to make it a specific 
solution, not applicable to more complicated propagating 
waves. In particular, the sinusoid is defined for all times and 
distances, whereas in physical situations we deal with waves 
that exist only for a limited span in space and duration in time. 
Fortunately, as we will discuss later, an arbitrary wave shape 
can be decomposed into a set of harmonic waves using Fourier 
analysis. As a result, solutions describing the simple case of 
harmonic waves can be applied to more complicated cases. 

2.2.3 Reflection and transmission 

So far, we have discussed waves traveling along a string of uniform
velocity. To use this as an analogy for the earth, within
which physical properties vary with depth, we need to treat
waves on a string with variable properties along its length.
The simplest situation is a string composed of segments with
uniform properties. If the segments are long enough, we treat
the displacement in each segment as composed of propagating
waves described by the solution for a uniform string with
the appropriate properties, and then match solutions across the
boundaries between segments.

Fig. 2.2-5 A wave pulse incident from the left on a junction between
two strings of different properties gives rise to transmitted and reflected
wave pulses. The fact that the reflected wave is inverted shows that the
impedance is greater in the right string. Similarly, the fact that the
transmitted pulse has a smaller length shows that the velocity is
lower in the right string.

To illustrate this approach, consider a junction between 
strings of different properties (Fig. 2.2-5). The junction at 
$x = 0$ separates string segment 1 on the left with density $\rho_1$ and
velocity $v_1$ from string segment 2 on the right ($x > 0$) with
density $\rho_2$ and velocity $v_2$. A wave arriving at the junction from
the left yields two new waves. Some of the incident wave 
reflects from the junction, and thus travels to the left in string 
segment 1. The remainder of the incident wave is transmitted 
across the junction and travels to the right in string segment 2. 
We will show that the relative amounts of reflected and transmitted
energy depend on the difference in properties across the
interface. 3 

For the joined string segments, we write the total displacement
in the left string segment as the sum of two harmonic
waves

$$u_1(x, t) = A e^{i(\omega t - k_1 x)} + B e^{i(\omega t + k_1 x)}. \qquad (8)$$

The signs of the complex exponentials indicate that the incident
wave, with amplitude $A$, travels in the +x direction, whereas
the reflected wave, with amplitude $B$, travels in the −x direction.
In the right-hand string segment there is only a transmitted
wave going in the +x direction

$$u_2(x, t) = C e^{i(\omega t - k_2 x)}. \qquad (9)$$

3 The wave’s simultaneous reflection and transmission is analogous to shining a
flashlight out of a window at night; you see the light reflected by the window, whereas
someone outside sees the light transmitted through the window.

The waves in the two string segments have different wavenumbers
because of the different velocities in the two segments.

The amplitudes of the reflected and transmitted waves are
found using two boundary conditions that the physics of the
string imposes on the solution at the junction $x = 0$. First, because
the two segments at the junction stay joined, the displacement
must always be continuous across the junction, so

$$u_1(0, t) = u_2(0, t), \qquad A e^{i\omega t} + B e^{i\omega t} = C e^{i\omega t}. \qquad (10)$$

For this to occur at all times, the angular frequency of the three
waves must be the same, as we have assumed, and the amplitudes
must satisfy

$$A + B = C. \qquad (11)$$


Second, the y components of the tension forces acting on
the two sides of the junction must always be equal, or the unequal
forces would tear the string apart. Thus, by analogy to
Eqn 2, we have another boundary condition

$$\tau \frac{\partial u_1(0, t)}{\partial x} = \tau \frac{\partial u_2(0, t)}{\partial x}. \qquad (12)$$

Taking the derivatives and canceling terms gives

$$\tau k_1 (A - B) = \tau k_2 C, \qquad (13)$$

or, because the velocities on the two sides are $v_i = (\tau/\rho_i)^{1/2}$ and
$k_i = \omega/v_i$,

$$\rho_1 v_1 (A - B) = \rho_2 v_2 C. \qquad (14)$$

We now have two equations (11 and 14) for the three constants
$A$, $B$, and $C$, giving the amplitudes of the incident,
reflected, and transmitted waves. We can eliminate $C$ and find
the ratio of the amplitudes of the reflected and incident waves,
known as the reflection coefficient,

$$R_{12} = \frac{B}{A} = \frac{\rho_1 v_1 - \rho_2 v_2}{\rho_1 v_1 + \rho_2 v_2}. \qquad (15)$$

Similarly, eliminating $B$ yields the transmission coefficient, the
ratio of transmitted and incident wave amplitudes,

$$T_{12} = \frac{C}{A} = \frac{2 \rho_1 v_1}{\rho_1 v_1 + \rho_2 v_2}. \qquad (16)$$

The “12” subscripts indicate that the reflection and transmission
coefficients describe a wave incident from segment 1
upon segment 2; the corresponding coefficients for a wave
incident from the right have subscripts “21.” These can be
derived by interchanging the subscripts, showing that

$$R_{12} = -R_{21}, \qquad T_{12} + T_{21} = 2. \qquad (17)$$

The reflection and transmission coefficients depend on the
product of the density and velocity for each string, $\rho_i v_i$, a
quantity called the acoustic impedance. Because the amount
reflected depends on the difference in impedances between the
two sides, the strongest reflections occur at boundaries where
properties change significantly. One limiting case is if the
materials on both sides of the junction are identical ($\rho_1 = \rho_2$ and
$v_1 = v_2$); the reflection coefficient is zero and the transmission
coefficient would be one. Hence, as expected, all the wave is
transmitted, and none reflects. The other limiting case, total
reflection and no transmission, occurs at the end of a string.
The fixed end of a string, where no displacement occurs, can be
treated as a junction with a string of infinite impedance. Hence
the reflection coefficient is

$$R_{\text{fixed}} = \frac{\rho_1 v_1 - \infty}{\rho_1 v_1 + \infty} = -1, \qquad (18)$$


so the entire incident wave pulse reflects with the opposite
polarity. Similarly, a string whose end is free to move is described
by the condition that the derivative $\partial u/\partial x$ is zero,
because there is no force applied. This can be treated as a junction
with a string of zero impedance, so the reflection coefficient
is +1, and the entire incident pulse reflects with the same
polarity. For values between the limiting cases, Eqn 15 shows
that the polarity of the reflection depends upon whether the
wave leaves or enters a string of greater impedance. If the
impedance of segment 2 exceeds that of segment 1, waves
going from segment 1 toward segment 2 reflect with reversed
polarity, whereas waves going the other way reflect without
changing polarity. Reflections at free and fixed ends are
extreme cases of this property. Hence the amplitudes of reflections
from boundaries can be used to infer changes in physical
properties.
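
The reflection and transmission coefficients of Eqns 15 and 16 can be sketched as two short functions of the impedances on the incident and far sides of the junction. The function names are hypothetical, the segment properties are those of the example in Fig. 2.2-6, and the fixed end is approximated by a very large (rather than infinite) impedance to avoid an undefined floating-point ratio.

```python
import math


def impedance(density, velocity):
    """Acoustic impedance: product of density and velocity."""
    return density * velocity


def reflection_coefficient(z_in, z_out):
    """Eqn 15, with z_in the impedance on the side of the incident wave."""
    return (z_in - z_out) / (z_in + z_out)


def transmission_coefficient(z_in, z_out):
    """Eqn 16, with z_in the impedance on the side of the incident wave."""
    return 2.0 * z_in / (z_in + z_out)


z1 = impedance(1.0, 3.0)   # segment 1 of Fig. 2.2-6
z2 = impedance(4.0, 1.5)   # segment 2 of Fig. 2.2-6

r12, t12 = reflection_coefficient(z1, z2), transmission_coefficient(z1, z2)
r21, t21 = reflection_coefficient(z2, z1), transmission_coefficient(z2, z1)
print(f"R12 = {r12:.2f}, T12 = {t12:.2f}, R21 = {r21:.2f}, T21 = {t21:.2f}")

# Eqn 17: interchanging the subscripts.
print(math.isclose(r12, -r21), math.isclose(t12 + t21, 2.0))

# Limiting cases: identical strings, a free end, and a nearly fixed end.
r_same = reflection_coefficient(z1, z1)
r_free = reflection_coefficient(z1, 0.0)
r_fixed = reflection_coefficient(z1, 1.0e12)
print(r_same, r_free, r_fixed)
```

Because only the impedances enter, two strings with different densities and velocities but equal products of the two would, under this description, produce no reflection at their junction.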

To illustrate these ideas, consider the reflection and transmission
of waves on a string divided at $x = 10$ into two
segments (Fig. 2.2-6). The left segment has $\rho_1 = 1$, $v_1 = 3$, and
the right segment has $\rho_2 = 4$, $v_2 = 1.5$. At time 0 the string
is plucked for a very short time by a source at the position
marked by the triangle, so waves spread out in either direction.

At time 1, the first time shown, the wave traveling to the right
has just encountered the junction (marked by a vertical dashed
line). The reflection and transmission coefficients depend on the
impedances $\rho_1 v_1 = 3$ and $\rho_2 v_2 = 6$. Thus for waves going from
left to right $R_{12} = -0.33$ and $T_{12} = 0.67$. A small reflected pulse
is generated, with a downward polarity opposite that of the
incident pulse, because the reflection coefficient is negative. At
time 2 we see this reflected wave traveling to the left and a
larger transmitted wave traveling to the right. Note that,
because of the different velocities, the reflected wave is further
from the junction than the transmitted wave.

Fig. 2.2-6 Wave propagation on a string composed of two segments
of different properties: the left (segment 1) with $\rho_1 = 1$, $v_1 = 3$, and the
right (segment 2) with $\rho_2 = 4$, $v_2 = 1.5$. The triangle marks the position of
the source (distance 6.5) that plucked the string at time 0. The traces are
successive snapshots of the string one time unit apart. The vertical dashed
line indicates the position of the junction. Both ends of the string are fixed,
so reflections there have unchanged amplitude but reversed polarity.

At time 2 the original pulse traveling to the left has reached 
the left end of the string. What happens to it depends on the 
boundary condition at the end. Here, we assumed that the 
ends were fixed, so at time 3 the pulse is inverted and reflected. 
Similarly at time 5 the first reflection off the junction has been 
inverted at the left end and now travels to the right. 

When a pulse arrives at the junction, part is reflected and
part is transmitted. For example, at time 6, the original pulse
reflected from the left end has been converted at the junction
into a transmitted wave with downward polarity and a reflected
wave with positive polarity. As time goes by, many
pulses develop, each with an amplitude that is the product of its
history. Thus, if the initial pulses had unit amplitude, the first
reflection has amplitude $R_{12}$. Once inverted by reflection off the
fixed left end, this pulse has amplitude $R_{12}(-1)$. When it reaches
the middle again (time 8), it gives rise to the small reflection
with amplitude $R_{12}(-1)R_{12} = -0.11$ and a transmitted pulse
with amplitude $R_{12}(-1)T_{12} = 0.22$.

By time 14, the original pulse that traveled to the right has
been transmitted to segment 2, inverted by reflection off the
right boundary (time 8), and is now incident on the junction
from the right. The reflection and transmission coefficients for
a wave incident from segment 2 are $R_{21} = 0.33$, $T_{21} = 1.33$.
Thus the reflected and transmitted pulses have the same downward
polarity as the incident wave and amplitudes
$T_{12}(-1)R_{21} = -0.22$ and $T_{12}(-1)T_{21} = -0.89$.
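The bookkeeping above is easy to check numerically. The sketch below is only an illustration: it assumes the impedance forms $R = (\rho_{in}v_{in} - \rho_{out}v_{out})/(\rho_{in}v_{in} + \rho_{out}v_{out})$ and $T = 2\rho_{in}v_{in}/(\rho_{in}v_{in} + \rho_{out}v_{out})$ for Eqns 15 and 16, which are not reproduced in this section but agree with the values quoted here. The function and variable names are ours.

```python
def reflection_transmission(rho_in, v_in, rho_out, v_out):
    """Amplitude coefficients for a wave passing from the 'in' segment to the 'out' segment.

    Assumed forms of Eqns 15 and 16, written in terms of the impedances rho * v.
    """
    z_in = rho_in * v_in      # impedance of the segment the wave arrives from
    z_out = rho_out * v_out   # impedance of the segment beyond the junction
    r = (z_in - z_out) / (z_in + z_out)
    t = 2.0 * z_in / (z_in + z_out)
    return r, t


rho1, v1 = 1.0, 3.0   # left segment of Fig. 2.2-6
rho2, v2 = 4.0, 1.5   # right segment of Fig. 2.2-6

R12, T12 = reflection_transmission(rho1, v1, rho2, v2)
R21, T21 = reflection_transmission(rho2, v2, rho1, v1)

FIXED_END = -1.0  # reflection at a fixed end reverses polarity

# Each amplitude is the product of the coefficients met along the pulse's path.
pulse_amplitudes = {
    "time 8, reflected": R12 * FIXED_END * R12,
    "time 8, transmitted": R12 * FIXED_END * T12,
    "time 14, reflected": T12 * FIXED_END * R21,
    "time 14, transmitted": T12 * FIXED_END * T21,
}

for label, amplitude in pulse_amplitudes.items():
    print(f"{label:22s} {amplitude:+.2f}")
```

Rounded to two decimals, the four products reproduce the amplitudes quoted in the text for times 8 and 14.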

It may seem curious that, because T 21 is greater than 1, 
waves transmitted to the left have larger amplitude than the 
incident wave that generated them. This effect, although not 
appealing intuitively, is possible so long as the energy in the 
transmitted wave does not exceed that in the incident wave. We 
will show later that this is the case. 

When a pulse is transmitted across the junction, its length
as well as its amplitude changes. For example, the transmitted
pulse at time 2 is shorter than the incident pulse. This results
from the different velocities. To see this, recall that for a harmonic
wave the angular frequencies of the transmitted and
incident waves in the two strings are the same because the
strings stay joined (Eqn 10). Thus

$$\omega = v_1 k_1 = v_2 k_2 = v_1 2\pi/\lambda_1 = v_2 2\pi/\lambda_2, \tag{19}$$

so the wavelength is shorter in the slower string. Another way
to see this is from the time needed for an incident pulse to be
transmitted (Fig. 2.2-7). If the pulse in segment 1 has length $\lambda_1$,
it takes a time $\lambda_1/v_1$ to pass through the junction. The length
of the transmitted pulse in segment 2 is the distance $v_2\lambda_1/v_1$
traveled by the leading edge of the transmitted pulse when the
trailing edge of the incident pulse reaches the boundary.
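As a small worked example of this argument, the sketch below computes the transmitted pulse length for the velocities of Fig. 2.2-6. The incident length of 1 is a hypothetical value chosen for illustration, and the function name is ours.

```python
def transmitted_pulse_length(incident_length, v_incident, v_transmitted):
    """Length of the transmitted pulse, from the time the incident pulse needs to cross."""
    crossing_time = incident_length / v_incident   # time for the pulse to pass the junction
    return v_transmitted * crossing_time           # distance covered by the leading edge


v1, v2 = 3.0, 1.5     # velocities of segments 1 and 2 in Fig. 2.2-6
lambda1 = 1.0         # hypothetical incident pulse length
lambda2 = transmitted_pulse_length(lambda1, v1, v2)

print(f"incident length {lambda1}, transmitted length {lambda2:.3f}")
```

Because segment 2 is half as fast as segment 1, the transmitted pulse is half as long, in agreement with Eqn 19.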

A point worth noting is that the displacement at a point
on the string is the sum of the displacements of all the waves
passing by that point. For example, at time 10 (in Fig. 2.2-6)
two waves, traveling in opposite directions, add up to give
a large pulse. At the next time step, the two waves have
separated. Thus a wave has no lasting effect after crossing
another; the waves “go through” each other. The concept that
the waves can be added up without affecting each other is
called linear superposition. This is generally assumed to be
valid unless the amplitudes of the waves are so large that the
material behaves nonlinearly, or differently from the simple
elastic assumptions used to derive the propagating wave equation.
Superposition allows us to form waves of arbitrary shape
from harmonic waves of different frequencies using a Fourier
series, as was done to form the pulses in this example. This
posed no difficulty because in our derivation neither the velocity
nor the reflection and transmission coefficients depended
on frequency.
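Superposition can be illustrated with two pulses that cross on a uniform string. This is a minimal sketch under our own assumptions: Gaussian pulse shapes, a string long enough that the ends play no role, and starting positions chosen for convenience. None of these come from Fig. 2.2-6.

```python
import math


def pulse(x, center, width=0.5):
    """Gaussian pulse of unit amplitude (an assumed shape, for illustration only)."""
    return math.exp(-((x - center) / width) ** 2)


def displacement(x, t, v=3.0, x_left=4.0, x_right=16.0):
    """Sum of a right-going pulse that starts at x_left and a left-going pulse from x_right."""
    right_going = pulse(x - v * t, x_left)    # centered at x_left + v t
    left_going = pulse(x + v * t, x_right)    # centered at x_right - v t
    return right_going + left_going


t_cross = (16.0 - 4.0) / (2.0 * 3.0)            # time at which the pulses coincide at x = 10
peak_before = displacement(4.0, 0.0)            # right-going pulse alone, before crossing
peak_during = displacement(10.0, t_cross)       # both pulses on top of each other
peak_after = displacement(16.0, 2.0 * t_cross)  # right-going pulse alone, after crossing

print(peak_before, peak_during, peak_after)
```

The displacement doubles while the pulses overlap and then returns to the single-pulse amplitude, so neither wave is changed by the crossing.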

The fact that the amplitudes of waves on a string change as
they are reflected and transmitted at interfaces where the properties
of the string change illustrates a concept important for
seismic waves in the earth. We will use this approach to show
how we study changes in physical properties at depth in the
earth from the amplitudes of reflected and transmitted waves.

Fig. 2.2-7 An incident wave pulse of length $\lambda_1$ on a string with velocity $v_1$
generates a transmitted pulse of length $\lambda_2$ in a string with velocity $v_2$.
The change in pulse length results from the distance the transmitted
pulse travels while the incident pulse passes through the junction. If the
amplitude of the incident pulse is 1, then the reflected and transmitted
pulses have amplitudes $R_{12}$ and $T_{12}$.


2.2.4 Energy in a harmonic wave

We noted earlier that in some cases the transmission coefficient
exceeds 1. To see how this occurs, we consider the energy
transported by the traveling waves. It turns out that although 
amplitudes are easier to visualize, energy is often more useful 
for understanding wave behavior because energy is conserved, 
whereas amplitude is not. Hence, when a result for amplitudes 
is hard to understand, considering the energy can provide 
insight. 

By analogy to the kinetic energy $mv^2/2$ of a point mass, the
kinetic energy, KE, of a segment dx of the string is found from
the velocity, the time derivative of the displacement, so

$$KE = \frac{1}{2}\rho\left(\frac{\partial u}{\partial t}\right)^2 dx, \tag{20}$$

because the mass of the string segment is $m = \rho\,dx$.

The string also stores potential energy, because it is stretched,
or deformed, from its equilibrium position. We will see shortly
that a measure of the deformation is the strain, $e$, which for
the string is the ratio of the change in the length to the original
length. Hence for an element of the string (Fig. 2.2-1) with
initial length dx, the strain due to the displacement du is

$$e = \frac{(dx^2 + du^2)^{1/2} - dx}{dx} = \left[1 + \left(\frac{\partial u}{\partial x}\right)^2\right]^{1/2} - 1 \approx \frac{1}{2}\left(\frac{\partial u}{\partial x}\right)^2, \tag{21}$$

where the last step used the Taylor series $(1 + a^2)^{1/2} \approx 1 + a^2/2$
for small $a$. The potential energy stored in the string is the product
of the tension $\tau$ and the strain integrated over the entire
length L,

$$\int_0^L \frac{\tau}{2}\left(\frac{\partial u}{\partial x}\right)^2 dx, \tag{22}$$

so we can define the average potential energy, PE, in a segment
dx as

$$PE = \frac{\tau}{2}\left(\frac{\partial u}{\partial x}\right)^2 dx. \tag{23}$$


We characterize the energy of a traveling wave by the kinetic
and potential energy averaged over a wavelength. If
$u(x, t) = A\cos(\omega t - kx)$, then the kinetic energy averaged over a
wavelength is

$$KE = \frac{\rho}{2\lambda}\int_0^\lambda \left(\frac{\partial u}{\partial t}\right)^2 dx = \frac{\rho A^2\omega^2}{2\lambda}\int_0^\lambda \sin^2(\omega t - kx)\,dx. \tag{24}$$

The integral of the sinusoid squared over a period is

$$\int_0^\lambda \sin^2(\omega t - kx)\,dx = \lambda/2, \tag{25}$$

so the kinetic energy is

$$KE = A^2\omega^2\rho/4. \tag{26}$$


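Similarly, the potential energy averaged over a wavelength is

$$PE = \frac{\tau}{2\lambda}\int_0^\lambda \left(\frac{\partial u}{\partial x}\right)^2 dx = \frac{\tau A^2 k^2}{2\lambda}\int_0^\lambda \sin^2(\omega t - kx)\,dx, \tag{27}$$

which, using Eqn 25, becomes

$$PE = \tau A^2 k^2/4 = A^2\omega^2\rho/4, \tag{28}$$

the same as the kinetic energy.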

Hence the total energy transported, averaged over a wavelength,
is the sum of the potential and kinetic energies:

$$E = PE + KE = A^2\omega^2\rho/2. \tag{29}$$

Another way to state this is in terms of the energy flux, the rate
at which the wave transports energy past a point on the string.
The average flux is just the averaged energy times the velocity,

$$\dot{E} = A^2\omega^2\rho v/2. \tag{30}$$

For a string of a given density, the energy flux is proportional
to the squares of the amplitude and the angular frequency, so
higher-frequency waves transport more energy.
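The wavelength averages can be confirmed by direct numerical integration. The sketch below is an illustration rather than part of the derivation: it assumes the tension is $\tau = \rho v^2$ (the relation implied by the second equality in Eqn 28), and the values of $A$, $\omega$, $\rho$, and $v$ are arbitrary choices of ours.

```python
import math


def wavelength_averages(A, omega, rho, v, t=0.0, n=2000):
    """Midpoint-rule averages of kinetic and potential energy over one wavelength."""
    k = omega / v
    tension = rho * v ** 2            # assumed relation between tension, density, velocity
    wavelength = 2.0 * math.pi / k
    dx = wavelength / n
    ke = pe = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        s = math.sin(omega * t - k * x)
        du_dt = -A * omega * s        # time derivative of A cos(omega t - k x)
        du_dx = A * k * s             # space derivative of A cos(omega t - k x)
        ke += 0.5 * rho * du_dt ** 2 * dx
        pe += 0.5 * tension * du_dx ** 2 * dx
    return ke / wavelength, pe / wavelength


A, omega, rho, v = 2.0, 5.0, 1.0, 3.0           # arbitrary illustrative values
ke_avg, pe_avg = wavelength_averages(A, omega, rho, v)
expected_each = A ** 2 * omega ** 2 * rho / 4.0  # Eqns 26 and 28
flux = (ke_avg + pe_avg) * v                     # averaged energy times velocity, Eqn 30

print(ke_avg, pe_avg, expected_each, flux)
```

The kinetic and potential averages agree with each other and with $A^2\omega^2\rho/4$, and the flux matches $A^2\omega^2\rho v/2$.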

Consideration of the energy explains how in Fig. 2.2-6 the
transmitted wave can have higher amplitude than the incident
wave. To see that an incident wave converting into reflected
and transmitted waves conserves energy, assume that a wave
in segment 1, described by $\cos(\omega t - k_1 x)$, is incident on the
junction. It gives rise to a reflected wave in segment 1, described
by $R_{12}\cos(\omega t + k_1 x)$, and a transmitted wave in segment 2,
described by $T_{12}\cos(\omega t - k_2 x)$. Using Eqns 15 and 16 for $R_{12}$
and $T_{12}$, the net energy flux for the reflected and transmitted
waves is the sum

$$\dot{E}_R + \dot{E}_T = R_{12}^2\,\omega^2\rho_1 v_1/2 + T_{12}^2\,\omega^2\rho_2 v_2/2 = (\omega^2/2)\left[R_{12}^2 v_1\rho_1 + T_{12}^2 v_2\rho_2\right] = \omega^2\rho_1 v_1/2 = \dot{E}_I, \tag{31}$$

which equals the energy flux in the incident wave. Thus, even
if the amplitude of the transmitted wave exceeds that of the
incident wave, the energy of the transmitted wave is less than
that of the incident wave.⁴
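Equation 31 can be checked for the string of Fig. 2.2-6 in both directions of incidence. As before, this sketch assumes the impedance forms of the reflection and transmission coefficients (Eqns 15 and 16, not reproduced in this section); the function names and the choice $\omega = 1$ are ours.

```python
def energy_flux(amplitude, omega, rho, v):
    """Average energy flux of a harmonic wave, Eqn 30."""
    return amplitude ** 2 * omega ** 2 * rho * v / 2.0


def flux_budget(rho_in, v_in, rho_out, v_out, omega=1.0):
    """Incident, reflected, and transmitted fluxes for a unit-amplitude incident wave."""
    z_in, z_out = rho_in * v_in, rho_out * v_out
    r = (z_in - z_out) / (z_in + z_out)   # assumed form of Eqn 15
    t = 2.0 * z_in / (z_in + z_out)       # assumed form of Eqn 16
    incident = energy_flux(1.0, omega, rho_in, v_in)
    reflected = energy_flux(r, omega, rho_in, v_in)
    transmitted = energy_flux(t, omega, rho_out, v_out)
    return incident, reflected, transmitted


# Wave incident from segment 2 (rho = 4, v = 1.5) onto segment 1 (rho = 1, v = 3),
# the case in which the transmission coefficient exceeds 1.
incident, reflected, transmitted = flux_budget(4.0, 1.5, 1.0, 3.0)

print(f"incident {incident:.3f} = reflected {reflected:.3f} + transmitted {transmitted:.3f}")
print(f"fraction of flux transmitted: {transmitted / incident:.3f}")
```

Although the transmitted amplitude is larger than the incident amplitude in this case, the transmitted flux is smaller than the incident flux, and the reflected and transmitted fluxes sum to the incident flux.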

2.2.5 Normal modes of a string 

So far, we have discussed waves propagating along a string.
Additional insight into propagating waves can be gained by
considering standing waves, which are known as the normal
modes, or free oscillations, of the string.

Recall that we began by applying Newton’s second law to a
string, and found that the displacement u(x, t) as a function of
position and time satisfied the scalar wave equation

$$\frac{\partial^2 u(x,t)}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 u(x,t)}{\partial t^2}.$$

We saw that this equation had solutions like

$$u(x,t) = A\cos(\omega t \pm kx), \tag{32}$$

which describes harmonic waves with angular frequency $\omega$
and wavenumber $k = 2\pi/\lambda$, propagating at velocity $v$ such that
$v = \omega/k$.

An alternative approach is to seek solutions of (4) with a
$\cos(\omega t)$ time dependence, such that

$$u(x, t) = U(x, \omega)\cos(\omega t), \tag{33}$$

by substituting this form into the wave equation (Eqn 4).⁵
Taking the derivatives and canceling the common factor yields

$$\frac{\partial^2 U(x,\omega)}{\partial x^2} = -\frac{\omega^2}{v^2}U(x,\omega). \tag{34}$$

One solution of this equation is

$$U(x, \omega) = \sin(\omega x/v). \tag{35}$$

If the string has fixed ends at x = 0 and x = L, then Eqn 35 must
satisfy the boundary conditions

$$U(0, \omega) = U(L, \omega) = 0. \tag{36}$$

The solution already satisfies the boundary condition at x = 0,
so all that is needed is to satisfy the boundary condition at
x = L,

$$U(L, \omega) = \sin(\omega L/v) = 0, \tag{37}$$

which occurs for angular frequencies $\omega_n$ such that

$$\omega_n L/v = n\pi \quad\text{or}\quad \omega_n = n\pi v/L. \tag{38}$$

Thus the zero displacement boundary conditions at the string’s
ends require that it vibrate only at specific frequencies, called
eigenfrequencies. The eigenfrequencies each correspond to a
solution

$$U_n(x, \omega_n)\cos(\omega_n t), \tag{39}$$

where the spatial term

$$U_n(x, \omega_n) = \sin(\omega_n x/v) = \sin(n\pi x/L) \tag{40}$$

is known as the spatial eigenfunction.

To interpret these solutions physically, note that $\omega = vk = v2\pi/\lambda$,
so the eigenfrequencies correspond to

$$\omega_n = n\pi v/L = 2\pi v/\lambda \quad\text{or}\quad L = n\lambda/2. \tag{41}$$

Thus each spatial eigenfunction has an integral number of half
wavelengths along the string’s length L, so the displacement
at both ends is zero. The solutions are standing waves, known
as the normal modes, or free oscillations, of the string, each of
which has a characteristic spatial eigenfunction and vibrates at
a characteristic eigenfrequency. Because the string is finite, it
can vibrate only in these discrete modes that satisfy the boundary
conditions. The eigenfrequencies are spaced $\pi v/L$ apart, so
the longer the string is (i.e., the larger L gets), the closer the
eigenfrequencies become.


4 An analogous phenomenon occurs at beaches, where waves increase in amplitude
as they approach the shore because the wave speed is proportional to the square root
of water depth.


5 This procedure amounts to taking the Fourier transform of the equation in
frequency, and then using a Fourier series in space. Fourier analysis is discussed in
chapter 6.
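Equations 38 and 40 translate directly into a few lines of code. The sketch below uses the velocity and length of the string in Fig. 2.2-8 ($v = 3$, $L = 20$) and, as our own addition, a second string twice as long to show how the spacing of the eigenfrequencies shrinks. The function names are ours.

```python
import math


def eigenfrequencies(v, L, n_max):
    """Angular eigenfrequencies omega_n = n pi v / L for n = 1..n_max (Eqn 38)."""
    return [n * math.pi * v / L for n in range(1, n_max + 1)]


def eigenfunction(n, x, L):
    """Spatial eigenfunction U_n(x) = sin(n pi x / L) (Eqn 40)."""
    return math.sin(n * math.pi * x / L)


omegas_short = eigenfrequencies(3.0, 20.0, 5)   # string of Fig. 2.2-8
omegas_long = eigenfrequencies(3.0, 40.0, 5)    # hypothetical string twice as long

spacing_short = omegas_short[1] - omegas_short[0]
spacing_long = omegas_long[1] - omegas_long[0]

print(f"spacing for L = 20: {spacing_short:.4f}")
print(f"spacing for L = 40: {spacing_long:.4f}")
print("end displacements of mode 3:", eigenfunction(3, 0.0, 20.0), eigenfunction(3, 20.0, 20.0))
```

Doubling the length halves the spacing $\pi v/L$, and every eigenfunction vanishes (to rounding error) at both ends.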

A traveling wave can be expressed as the weighted sum of the
string’s normal modes, so it is the sum of the eigenfunctions,
each weighted by the amplitude $A_n$ and vibrating at its
eigenfrequency $\omega_n$,

$$u(x, t) = \sum_{n=0}^{\infty} A_n U_n(x, \omega_n)\cos(\omega_n t). \tag{42}$$

An important feature of this solution is that the modes are
orthogonal, meaning that the integral over the string of the
product of two different eigenfunctions is zero,

$$\int_0^L \sin\frac{m\pi x}{L}\,\sin\frac{n\pi x}{L}\,dx = \frac{L}{2}\,\delta_{mn}, \tag{43}$$

where $\delta_{mn}$, the Kronecker delta symbol defined in Eqn A.3.37,
is zero unless m = n. Each mode is independent and cannot be
constructed by combining other modes. Thus we can think
of the displacement of the string as a vector in a vector space
(Section A.3.6) whose basis vectors are the eigenfunctions. Any
particular set of waves is given by the amplitudes $A_n$, which are
the weighting factors of the eigenfunctions or the components
of the basis vectors.
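Orthogonality is easy to verify numerically. The following sketch evaluates the overlap integral of two eigenfunctions with a midpoint rule; the integration scheme, the number of samples, and the function name are our choices.

```python
import math


def mode_overlap(m, n, L, samples=4000):
    """Midpoint-rule estimate of the integral of sin(m pi x/L) sin(n pi x/L) over 0..L."""
    dx = L / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += math.sin(m * math.pi * x / L) * math.sin(n * math.pi * x / L) * dx
    return total


L = 20.0
different_modes = mode_overlap(2, 3, L)   # expected to vanish
same_mode = mode_overlap(3, 3, L)         # expected to equal L/2

print(f"modes 2 and 3: {different_modes:.2e}")
print(f"mode 3 with itself: {same_mode:.6f}")
```

Different modes give an overlap that is zero to rounding error, whereas a mode overlapped with itself gives $L/2$.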

The amplitude for each eigenfunction depends on the position
of the source that generated the waves and on the behavior
of the source as a function of time. The spatial part of $A_n$ has
the same form as $U_n$ (Eqn 40), so

$$A_n = \sin(n\pi x_s/L)\,F(\omega_n), \tag{44}$$

where $x_s$ is the position of the source, and $F(\omega_n)$ is a weighting
factor describing how different frequencies contribute to the
time history of the source. Thus the normal mode expression
for the displacement (Eqn 42) can be written

$$u(x, t) = \sum_{n=0}^{\infty} \sin(n\pi x_s/L)\,F(\omega_n)\,\sin(n\pi x/L)\cos(\omega_n t). \tag{45}$$

Figure 2.2-8, computed in this way, illustrates how the first
40 modes of a string with fixed ends and a uniform velocity
combine to give traveling waves. The source, at $x_s = 8$, is
described by

$$F(\omega_n) = \exp[-(\omega_n\tau)^2/4] \tag{46}$$

with $\tau = 0.2$. The computer program used is similar to that
discussed in Section A.8.1. The mode sum shows two waves,
one propagating to the right and one propagating to the left,
at the expected positions. Hence the mode sum correctly gives
the propagating waves. In addition to the propagating waves,
we see some small oscillations along the string because only the
first 40 modes were summed.

Fig. 2.2-8 Displacement of a string with fixed ends computed using
the normal mode formulation. The string has length 20, velocity 3, 
and was plucked at time 0 by a source at position 8 (triangle). The bottom 
trace shows the displacement of the string at time 1.5, computed by 
summing the first 40 modes. The mode sum generates both the right- 
and the left-propagating waves at the appropriate positions. Spatial 
eigenfunctions for the individual modes, each of which corresponds to 
an integral number of half wavelengths, are also shown above the sum. 
The traces are normalized to unit amplitude. 
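The calculation behind Fig. 2.2-8 can be sketched in a few lines. This is not the program of Section A.8.1, only a minimal illustration of Eqns 45 and 46 with the parameters given in the caption. We assume that "the first 40 modes" means n = 1 to 40 (the n = 0 term is identically zero), and the grid spacing of 0.1 is our choice.

```python
import math


def source_weight(omega, tau=0.2):
    """Source frequency weighting F(omega), Eqn 46."""
    return math.exp(-(omega * tau) ** 2 / 4.0)


def mode_sum(x, t, x_source=8.0, L=20.0, v=3.0, n_modes=40, tau=0.2):
    """Displacement from Eqn 45, truncated after n_modes modes."""
    total = 0.0
    for n in range(1, n_modes + 1):          # n = 0 contributes nothing because sin(0) = 0
        omega_n = n * math.pi * v / L        # eigenfrequency, Eqn 38
        excitation = math.sin(n * math.pi * x_source / L) * source_weight(omega_n, tau)
        total += excitation * math.sin(n * math.pi * x / L) * math.cos(omega_n * t)
    return total


xs = [0.1 * i for i in range(201)]           # positions from 0 to 20
snapshot = [mode_sum(x, 1.5) for x in xs]    # string shape at time 1.5

# Locate the largest displacement on each side of the source.
left_peak = max((u, x) for u, x in zip(snapshot, xs) if x < 8.0)[1]
right_peak = max((u, x) for u, x in zip(snapshot, xs) if x > 8.0)[1]

print(f"left-going pulse near x = {left_peak:.1f}")
print(f"right-going pulse near x = {right_peak:.1f}")
```

With velocity 3 and time 1.5, the two pulses are expected a distance 4.5 on either side of the source, near x = 3.5 and x = 12.5, which is where the summed modes place them.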


We now have two ways to think of the displacement of the 
string as a function of time: either as propagating waves or as 
normal modes. Neither is more “real”; both are ways of representing
how the displacement evolves. Thus comparing the
two gives interesting insights. For example, consider studying 
the properties of the string. In the traveling wave formulation, 
we measure travel times and thus infer velocity. In the normal 
mode formulation, we measure eigenfrequencies and then infer 
velocities. Thus the eigenfrequencies are analogous to the travel times.

The normal mode solution (Eqn 45) gives insight into the
relation between the medium in which waves propagate and 
the source that generates them. The waves are expressed as 
the sum of eigenfunctions weighted by amplitudes that depend 
on the source. The physical properties of the string control its 
velocity and thus its eigenfrequencies and spatial eigenfunctions. 
The displacement due to any particular source corresponds to a 
different weighting of the eigenfunctions. By analogy, we use 
the eigenfrequencies of the earth’s normal modes to study the 
properties of the medium (earth structure), and the displacement
(the specific weighting of eigenfunctions) to study the
source (generally an earthquake) that excited them.

The normal mode solution generates all the incident, reflected,
and transmitted waves, although they do not appear individually
as they do in a traveling wave solution like Eqn 8. The
mode solution is thus less intuitive, and individual modes are 
not physically meaningful, although their sum is. For example, 
each mode mathematically starts vibrating along the entire 
string at time zero, even though no waves have reached the 
string ends. When the modes are summed, the resulting waves 
propagate at the correct velocity. 

The solution also illustrates an important relation between
the positions of the source and the receiver. The fact that
Eqn 45 depends in the same way on the positions of the source
($x_s$) and the receiver ($x$) illustrates the principle of reciprocity,
which states that under appropriate conditions the same displacement
occurs if the positions of the source and the receiver
are interchanged. This principle is important for studying earth
structure because it is often convenient to place the source or
the receiver at a particular site. We can do this knowing that
the same ray paths and thus waves result.⁶ Equation 45 also
illustrates an important point about the relation of the source
position to the waves generated: namely, a source at a point
where a particular mode has no displacement will not excite
that mode. For example, in Fig. 2.2-8, modes with numbers
that are multiples of five give zero displacement because the
source term $\sin(n\pi x_s/L) = \sin(2n\pi/5)$ is zero. Analogously, in the earth,
surface waves whose displacements are largest near the surface
are not excited well by deep earthquakes.
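Both points can be checked with the mode sum of Eqn 45. The sketch below uses the parameters of Fig. 2.2-8; the receiver position 12.5, the time 1.5, the truncation at 40 modes, and the function names are our choices for illustration.

```python
import math


def source_term(n, x_source, L):
    """Spatial part of the excitation of mode n, sin(n pi x_s / L), from Eqn 44."""
    return math.sin(n * math.pi * x_source / L)


def mode_sum(x_receiver, x_source, t, L=20.0, v=3.0, n_modes=40, tau=0.2):
    """Displacement from Eqn 45 with the source weighting of Eqn 46."""
    total = 0.0
    for n in range(1, n_modes + 1):
        omega_n = n * math.pi * v / L
        weight = math.exp(-(omega_n * tau) ** 2 / 4.0)
        total += (source_term(n, x_source, L) * weight
                  * math.sin(n * math.pi * x_receiver / L) * math.cos(omega_n * t))
    return total


# Modes that a source at x_s = 8 on a string of length 20 cannot excite.
silent_modes = [n for n in range(1, 41) if abs(source_term(n, 8.0, 20.0)) < 1e-12]

# Reciprocity: interchange the source and receiver positions.
forward = mode_sum(12.5, 8.0, 1.5)
swapped = mode_sum(8.0, 12.5, 1.5)

print("unexcited modes:", silent_modes)
print(f"forward {forward:.6f}, swapped {swapped:.6f}")
```

The unexcited modes are the multiples of five, and interchanging source and receiver leaves the displacement unchanged apart from rounding.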

Finally, although we have discussed the normal modes of 
a uniform string, we could generalize these ideas to find the 
modes of a non-uniform string. One way to do this is to extend 
the method used to find the reflection and transmission coefficients
(Section 2.2.3). We treat the string as a set of uniform
pieces, use the harmonic wave solution in each piece, and 
impose displacement and traction boundary conditions at the 
junctions. We then numerically find eigenfrequencies that 
satisfy the fixed boundary condition at the string’s end. The 
normal modes of the non-uniform string are then summed to 
give the traveling waves. The waves on the non-uniform string 
in Fig. 2.2-6 were calculated in this way. 
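For a string of just two uniform pieces, the procedure can be sketched as follows. This is our own illustration, not the program used for Fig. 2.2-6. We take $U = \sin(\omega x/v_1)$ in the left piece of length $a$ and $U = B\sin[\omega(L - x)/v_2]$ in the right piece of length $b$, so both fixed ends are satisfied automatically. Matching displacement and traction (tension times slope) at the junction, and assuming a uniform tension $\tau = \rho_1 v_1^2 = \rho_2 v_2^2$ (which holds for the values of Fig. 2.2-6), gives the condition $\rho_1 v_1 \cos(\omega a/v_1)\sin(\omega b/v_2) + \rho_2 v_2 \sin(\omega a/v_1)\cos(\omega b/v_2) = 0$. The eigenfrequencies are its roots, found here by scanning for sign changes and bisecting.

```python
import math


def junction_mismatch(omega, a, b, v1, v2, z1, z2):
    """Vanishes when displacement and traction are both continuous at the junction."""
    return (z1 * math.cos(omega * a / v1) * math.sin(omega * b / v2)
            + z2 * math.sin(omega * a / v1) * math.cos(omega * b / v2))


def find_eigenfrequencies(a, b, v1, v2, rho1, rho2, omega_max, step=1e-3):
    """Scan 0 < omega < omega_max for sign changes and refine each root by bisection."""
    z1, z2 = rho1 * v1, rho2 * v2            # impedances of the two pieces
    roots = []
    lo = step
    f_lo = junction_mismatch(lo, a, b, v1, v2, z1, z2)
    while lo < omega_max:
        hi = lo + step
        f_hi = junction_mismatch(hi, a, b, v1, v2, z1, z2)
        if f_lo * f_hi < 0.0:                # a root lies between lo and hi
            left, right, f_left = lo, hi, f_lo
            for _ in range(60):
                mid = 0.5 * (left + right)
                f_mid = junction_mismatch(mid, a, b, v1, v2, z1, z2)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
        lo, f_lo = hi, f_hi
    return roots


# String of Fig. 2.2-6: junction at x = 10 on a string of length 20.
two_piece = find_eigenfrequencies(10.0, 10.0, 3.0, 1.5, 1.0, 4.0, omega_max=1.0)

# Check against the uniform string, where Eqn 38 gives omega_n = n pi v / L.
uniform = find_eigenfrequencies(10.0, 10.0, 3.0, 3.0, 1.0, 1.0, omega_max=1.5)

print("two-piece string:", [round(w, 4) for w in two_piece])
print("uniform string:  ", [round(w, 4) for w in uniform])
```

For the uniform string the roots reduce to $n\pi v/L$, which is a useful check on the matching condition. The scan can miss a root that coincides with a grid point or two roots closer together than the step, so the step should be small compared with the expected spacing.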


6 A familiar version for light waves, seen on the back of large trucks, warns other
drivers that “If you can’t see my mirrors, I can’t see you.”

2.3.1 Introduction

By applying Newton’s second law of motion, F = ma, to a string,
we found that deforming the string gave rise to propagating
waves. Similarly, deforming the solid earth produces seismic
waves. We study these waves using concepts from continuum
mechanics, which describes the behavior of a continuous
deformable material made up of particles packed so closely
together that density, force, and displacement can be thought
of as continuous and differentiable functions. This approximation
breaks down on an atomic distance scale, but is adequate
for most seismological problems.

For these applications, we write Newton’s second law in
terms of the force per unit volume and the density, the mass per
unit volume. If the density does not change with time, the force
per unit volume $\mathbf{f}(\mathbf{x}, t)$ equals the inertial term, the product
of the density $\rho$ and the second derivative of the displacement
vector $\mathbf{u}(\mathbf{x}, t)$ with time. Thus F = ma becomes

$$\mathbf{f}(\mathbf{x}, t) = \rho\,\frac{\partial^2 \mathbf{u}(\mathbf{x}, t)}{\partial t^2}. \tag{1}$$

This vector equation can be written as a set of three equations,
one for each component of the force and displacement vectors¹

$$f_i(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{2}$$


In seismic wave propagation, both the displacement and the 
force vectors can vary in space and time. Although this dependence
is generally not written explicitly, we will sometimes do
so to remind ourselves that the solutions depend on space and 
time. 

The goal of this section is to use Newton’s second law to 
characterize a continuous medium and its response to applied 
forces. We first introduce the stress tensor that describes the 
forces acting on a deformable continuous medium. We then 
formulate the equation of motion, the version of Newton’s law 
appropriate for a continuous medium, which relates the stress 
to the displacement. The variation in displacement within the 
material, described by the strain tensor, gives rise to internal
deformation. This deformation is related to the stress via the 
constitutive equation that characterizes the properties of the 
material. Our brief discussion covers some basic results of 
continuum mechanics necessary for introductory seismology. 
The suggested reading listed at the end of the chapter provides 
further treatment of these and related topics. 


1 The three equations are written as one using index notation (Section A.3.5) in
which the index i ranges from 1 to 3 over the coordinate axes. Index notation makes
cumbersome vector equations shorter, clearer, and often easier to solve. These equations
are often made even more compact using a dot superscript to indicate differentiation
with respect to time, so the acceleration is $\ddot{u}_i$.

Fig. 2.3-1 Surface force on a volume element V within a material. The
surface force F due to the material outside V acts on each element of 
surface dS, which has an outward-pointing unit normal vector n. 


2.3.2 Stress 

Two types of forces can act on an object. The first is a body
force, which acts everywhere within an object, resulting in a
net force proportional to the volume of the object. A familiar
example is the body force $\mathbf{g}$ due to gravity; the net force on an
infinitesimal body with density $\rho$ and volume $dV$ is $\rho\mathbf{g}\,dV$. The
units of a body force are force per unit volume.

A second type of force is a surface force, which acts on the
surface of an object, yielding a net force proportional to the 
surface area of the object. For example, an object in a pool 
of fluid is subject to a pressure equal to the weight (a force) 
per unit area of the fluid above the object. At any point on the 
object’s surface, the pressure is directed along the normal to 
the surface. Thus a surface force like pressure acts in different 
directions on different parts of an object, in contrast to gravity, 
which is a body force that always points down. Surface forces 
have units of force per unit area. 

We now consider the forces acting on a small volume V, with
surface S, within a larger continuous medium (Fig. 2.3-1). The
material inside V is affected by body forces acting on everything
inside V and surface forces, due to the material outside,
acting on the surface S. If the surface force $\mathbf{F}$ acts on each element
of surface dS, whose outward unit normal vector is $\hat{\mathbf{n}}$, we
define the traction vector, $\mathbf{T}$, as the limit of the surface force per
unit area at any point as the area becomes infinitesimal:

$$\mathbf{T}(\hat{\mathbf{n}}) = \lim_{dS \to 0} \frac{\mathbf{F}}{dS}. \tag{3}$$


The traction vector has the same orientation as the force, and is 
a function of the unit normal vector n because it depends on the 
orientation of the surface. 

The system of surface forces acting on a volume is described
by three traction vectors. Each acts on a surface that is perpendicular
to a coordinate axis (Fig. 2.3-2) and thus parallel to the
plane defined by the other two axes. We define $\mathbf{T}^{(j)}$ as the traction
vector acting on the surface whose outward normal is in
the positive $\hat{\mathbf{e}}_j$ direction. The components of the three traction
vectors are $T_i^{(j)}$, where the upper index (j) indicates the surface
and the lower index (i) indicates the component. For example,
$T_3^{(1)}$ is the $x_3$ component of the traction on the surface whose
normal is $\hat{\mathbf{e}}_1$.

Fig. 2.3-2 Traction vectors acting on three faces of a volume element
which are perpendicular to the coordinate axes. The superscript on
T indicates the direction of the normal to the face on which T acts.
The three components of $\mathbf{T}^{(2)}$ are shown.

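This set of nine terms that describes the surface forces can be
grouped into the stress tensor, $\sigma_{ji}$. The tensor’s rows are the
three traction vectors, such that

$$\sigma_{ji} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix} = \begin{pmatrix} \mathbf{T}^{(1)} \\ \mathbf{T}^{(2)} \\ \mathbf{T}^{(3)} \end{pmatrix} = \begin{pmatrix} T_1^{(1)} & T_2^{(1)} & T_3^{(1)} \\ T_1^{(2)} & T_2^{(2)} & T_3^{(2)} \\ T_1^{(3)} & T_2^{(3)} & T_3^{(3)} \end{pmatrix}. \tag{4}$$

Thus the stress component $\sigma_{ji}$ is the ith component of the traction
vector acting on the surface whose outward normal points
in the $\hat{\mathbf{e}}_j$ direction. The stress gives the force per unit area that
the material on the outside (the side to which $\hat{\mathbf{n}}$ points) of the
surface exerts on the material inside. In the special geometry of
Fig. 2.3-2, where the surfaces are along coordinate axes, it is
easy to see that $\sigma_{ji} = T_i^{(j)}$.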

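In some applications, it is more convenient to write the coordinate
axes as x, y, and z, so the stress tensor is written

$$\begin{pmatrix} \sigma_{xx} & \sigma_{xy} & \sigma_{xz} \\ \sigma_{yx} & \sigma_{yy} & \sigma_{yz} \\ \sigma_{zx} & \sigma_{zy} & \sigma_{zz} \end{pmatrix}. \tag{5}$$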

The stress tensor gives the traction vector $\mathbf{T}$ acting on any
surface within the medium. To illustrate this, we examine the
traction on an arbitrary element of surface dS, whose normal $\hat{\mathbf{n}}$
is not along a coordinate axis. Consider the material inside an
infinitesimal tetrahedron of volume dV formed by this surface
and three other faces, each perpendicular to a coordinate axis,
with normal in the $-\hat{\mathbf{e}}_j$ direction (Fig. 2.3-3). The area of the
face with its normal in the $-\hat{\mathbf{e}}_j$ direction is given by using the
scalar product to find the cosine of the angle between $\hat{\mathbf{n}}$ and $\hat{\mathbf{e}}_j$,

$$(\hat{\mathbf{n}} \cdot \hat{\mathbf{e}}_j)\,dS = n_j\,dS. \tag{6}$$

Fig. 2.3-3 Stress components on three faces of a tetrahedron, with
normals parallel to coordinate axes. Summing the resulting forces
yields the net force on the fourth (slanted) side.

Because traction is force per unit area, the net surface force in
a given direction is found by multiplying each component of
the traction by the area of the face it acts on and summing over
the faces. Thus the total force in the $\hat{\mathbf{e}}_i$ direction is that due to
this component of the traction, those resulting from the stress
on the other three faces, and the component of the body force
$\mathbf{f}$ in this direction. This total force equals the mass $\rho\,dV$ of
the tetrahedron times the component of acceleration in the
$\hat{\mathbf{e}}_i$ direction,

$$T_i\,dS - \sum_{j=1}^{3} \sigma_{ji} n_j\,dS + f_i\,dV = \rho\,\frac{\partial^2 u_i}{\partial t^2}\,dV. \tag{7}$$

Dividing by the area and letting dV/dS go to zero, we see that
the stress tensor is related to the traction and normal vectors by

$$T_i = \sum_{j=1}^{3} \sigma_{ji} n_j = \sigma_{ji} n_j, \tag{8}$$

where the last form uses the index notation convention that a
repeated index indicates summation (Section A.3.5). Because
this equation gives the traction on an arbitrary surface, the
stress tensor describes the surface forces acting on any volume
within the material.
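Equation 8 amounts to a sum over the rows of the stress tensor weighted by the components of the normal. The sketch below applies it to a stress tensor whose values are hypothetical, chosen only for illustration (symmetric, with compressive normal stresses); the function and variable names are ours.

```python
import math


def traction(stress, normal):
    """Traction T_i = sum over j of sigma_ji n_j (Eqn 8).

    stress[j][i] holds the component sigma_(j+1)(i+1).
    """
    return [sum(stress[j][i] * normal[j] for j in range(3)) for i in range(3)]


# Hypothetical stress tensor in arbitrary units, for illustration only.
sigma = [[-10.0, 2.0, 0.0],
         [2.0, -20.0, 1.0],
         [0.0, 1.0, -30.0]]

# On a face whose normal is the x3 axis, the traction is the third row of the tensor.
t_on_x3_face = traction(sigma, [0.0, 0.0, 1.0])

# On a slanted face equally inclined to the three axes.
s = 1.0 / math.sqrt(3.0)
n_slanted = [s, s, s]
t_on_slanted = traction(sigma, n_slanted)
normal_component = sum(t_on_slanted[i] * n_slanted[i] for i in range(3))

print("traction on x3 face:", t_on_x3_face)
print("traction on slanted face:", [round(c, 3) for c in t_on_slanted])
print(f"normal traction on slanted face: {normal_component:.3f}")
```

For a coordinate face the result is just the corresponding row of the tensor, as in Eqn 4. For the slanted face the normal component of the traction is negative, corresponding to compression.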




Fig. 2.3-4 The sense of positive stress components for a volume with faces
perpendicular to the coordinate axes. $\sigma_{ji}$ is the stress component acting in
the $\hat{\mathbf{e}}_i$ direction on the face with outward normal in the $\hat{\mathbf{e}}_j$ direction.


The sign convention for stress components comes from the
relation between the outward normal and the basis vectors.
Figure 2.3-4 shows the positive stress components acting on a
cube of material with faces perpendicular to the coordinate axes.
For example, on the face with outward normal $\hat{\mathbf{e}}_3 = (0, 0, 1)$, $\sigma_{33}$
is positive in the $\hat{\mathbf{e}}_3$ direction, and $\sigma_{31}$ is positive in the $\hat{\mathbf{e}}_1$ direction.
Because the tractions are $T_i = \sigma_{3i}$, positive $\sigma_{33}$ and $\sigma_{31}$
yield forces in the $x_3$ and $x_1$ directions. By contrast, on the
opposite face with outward normal $-\hat{\mathbf{e}}_3 = (0, 0, -1)$, $\sigma_{33}$ is positive
in the $-x_3$ direction, and $\sigma_{31}$ is positive in the $-x_1$ direction.
Thus the tractions are $T_i = -\sigma_{3i}$, and positive $\sigma_{33}$ and $\sigma_{31}$ yield
forces in the $-x_3$ and $-x_1$ directions.

The three diagonal components of the stress tensor, $\sigma_{11}$, $\sigma_{22}$,
and $\sigma_{33}$, are known as normal stresses, and the six off-diagonal
components are called shear stresses. The corresponding components
of the traction vector are called normal and shear
tractions. Figure 2.3-4 shows that positive normal stresses
tend to expand the volume, whereas negative normal stresses
make the volume smaller. Thus positive values of the normal
tractions correspond to tension, whereas negative normal tractions
correspond to compression. At most points within the
earth, because material is under compression from the weight
of rock above, the normal stress components are negative.
Geophysicists thus often speak of the “maximum compressive
stress,” the most negative and largest in absolute value, and the
“minimum compressive stress,” the least negative and smallest
in absolute value.

An important property of a stress tensor is that it is 
symmetric , 


To show this, consider the torque (Eqn A.3.32) t 3 about the x 3 
axis on a rectangle of material with sides dx v dx 2 , along the 
coordinate axes (Fig. 2.3-5). If the torque is zero, the angular 
momentum of the block remains constant, so the block will not 
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ents, only the three normal ones and three of the six shear ones 
are independent. 

Because the stress tensor is symmetric, we usually write (8) as 


d<J 2 2 . 

On + —— C / X , 


d0 2 i t 

a 21 + ^dx 2 


T- = ^ = <%«-, 


or, in terms of the vectors rather than their components, 


dcT 1 2 , 


do n , 

a, 1+ —dx, 


Fig. 2.3-5 Clockwise and counterclockwise torques about the x 3 axis on a 
rectangle due to the stress components and body forces. If the stress tensor 
were not symmetric, u 12 = <J 21 , a net torque would arise. 


start to rotate if it is not already doing so. The net body force, if 
any, is f i dx 1 dx 2 , where is the force at the center of the block. 
Because a torque is the product of a force and a lever (or 
moment) arm, the shear stresses a 21 and o 12 acting on the faces 
along the x x and x 2 axes contribute no torque. The other stress 
components cause torques equal to the product of the lever arm 
and the traction, the stress component times the area of the 
face. Thus the total counterclockwise torque is the sum of that 
due to the shear tractions on the other two faces, with lever 
arms dx 1 and dx 2 , the normal tractions on all four faces, with 
lever arms dx^l and dxj 2, and the two body force com¬ 
ponents acting at the center of the block, with lever arms dx j/2 
and dx 2 ! 2: 

t 3 = <j 12 + —1? dx 1 dx 2 - <j 21 + ■ dx 2 1 dx x dx 2 

\ dx l) l dx 2 J 


dan , ] . dx 7 , dx 7 

- a n -\ - Li -dx 1 dx 2 —- + o n dx 2 —- 

dx t J 2 2 


3o" 22 j j dx-i j dx -j 

°22 4 -“ UX 2 dx t —- - <7 22 dx 1 — L 


+ f 2 dx 1 dx 2 ^~ - f 1 dx 1 dx 2 ^^- 


T = oh. (12) 

Stress has units of force per area. In the cgs system of units 
based on the centimeter, gram, and second, force is given 
in dynes (dyn), with 1 dyn = 1 g-cm/s 2 , so stress is given in 
dyn/cm 2 , or bars , a unit equal to 10 6 dyn/cm 2 . The bar has the 
convenient property that atmospheric pressure at sea level is 
1.01 bars. In SI units based on the meter, kilogram, and second 
(mks), force is given in Newtons (N), with 1 N = 1 kg-m/s 2 , so 
stress is given in Pascals (Pa), a unit equal to 1 N/m 2 . The two 
sets of units can be related by noting that 1 Pa = 10 5 dyn/ 
10 4 cm 2 =10 dyn/cm 2 = 10~ 5 bars, so 1 MPa equals 10 bars. 

233 Stress as a tensor 

We have been using the term “tensor” without defining it.
Already, we saw that it came from a relation between the traction
and normal vectors, and is an entity with two subscripts that
has properties similar to those of vectors. Vectors are entities
that are independent of coordinate system, so that physical
laws written using them do not depend on the coordinate
system and can be analyzed using any convenient coordinate
system. We now show that tensors are similar entities.

Specifically, a vector is an entity that remains the same in two
coordinate systems (Section A.5.1), such that its components
in two different Cartesian coordinate systems are related by the
transformation matrix $A$. Hence, given two sets of axes
$(x_1, x_2, x_3)$ and $(x'_1, x'_2, x'_3)$, the components of a vector
$\mathbf{u}$ are related by

$$ \mathbf{u}' = A\mathbf{u}. \tag{13} $$

The relation between the components of the stress tensor 
in two Cartesian coordinate systems can be found using the 
fact that it relates the traction and normal vectors in each 
coordinate system. The components of the traction and normal 
vectors in the two coordinate systems satisfy 

T' = AT, n' = An. (14) 

Dividing by the area and letting $dx_1$ and $dx_2$ go to zero, we see
that for there to be no torque, $\sigma_{12} = \sigma_{21}$. The same argument for
the torque about the other two axes shows that $\sigma_{13} = \sigma_{31}$ and
$\sigma_{23} = \sigma_{32}$. Thus, although the stress tensor has nine
components, only six are independent.

The reverse transformation can be written using the inverse of
$A$ which, because $A$ is orthogonal, equals its transpose:

$$ \hat{\mathbf{n}} = A^{-1}\hat{\mathbf{n}}' = A^{T}\hat{\mathbf{n}}'. \tag{15} $$

In the primed coordinate system, the traction is related to the
normal vector and the stress tensor by

$$ \mathbf{T}' = \sigma'\hat{\mathbf{n}}', \tag{16} $$

so, by Eqns 14 and 15,

$$ \mathbf{T}' = A\mathbf{T} = A\sigma\hat{\mathbf{n}} = A\sigma A^{T}\hat{\mathbf{n}}'. \tag{17} $$

Comparison of Eqn 16 and the last term in Eqn 17 shows that

$$ \sigma' = A\sigma A^{T}. \tag{18} $$


This equation defines a tensor in Cartesian coordinates. Recall
that what makes a vector more than a set of three numbers is its
transformation properties: the numerical values of the
components that describe it transform between coordinate systems
in a way that preserves the vector as an entity independent of
coordinate system. Similarly, a matrix of numbers is a tensor
only if it transforms between coordinate systems according
to Eqn 18. We derived this transformation by assuming that a
tensor, in this case stress, is an operator relating two vectors, in
this case the normal and traction, in a specific way regardless of
coordinate system. The tensor’s components transform between
coordinate systems, so the tensor as an entity does not change.
Because one application of the transformation matrix
transforms a vector, two applications transform a tensor that relates
two vectors. Unfortunately, tensors are harder to visualize than
vectors. Although the stress tensor may seem puzzling, it is one
of the easier tensors to interpret physically.

To illustrate these ideas, we consider an example of how a
stress tensor’s components change between coordinate systems.
Assume that a block of material, with faces perpendicular to
the $x_1$ and $x_2$ axes, is subject only to normal stresses $\sigma_1$ and $\sigma_2$
(Fig. 2.3-6), so the stress tensor is diagonal,

$$
\sigma = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{19}
$$

Now, consider the stress acting on a smaller block, with faces
of a different orientation, within the larger one. To find the
tractions on the second block’s sides, we define a second
coordinate system in which the $x'_1$ and $x'_2$ axes are normal to the
faces and rotated by $\theta$ with respect to the $x_1$ and $x_2$ axes,
whereas the $x_3$ and $x'_3$ axes coincide. Although the stress is
the same in both blocks, the components of the stress tensor
expressed in the two coordinate systems differ. The relation
between the components is given by

$$
\begin{aligned}
\sigma' = A\sigma A^{T}
&= \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix}
\sigma_1\cos^2\theta + \sigma_2\sin^2\theta & (\sigma_2 - \sigma_1)\sin\theta\cos\theta & 0 \\
(\sigma_2 - \sigma_1)\sin\theta\cos\theta & \sigma_1\sin^2\theta + \sigma_2\cos^2\theta & 0 \\
0 & 0 & 0
\end{pmatrix}.
\end{aligned}
\tag{20}
$$

For example, if $\sigma_1 = 1$, $\sigma_2 = -1$, and $\theta = 45°$,

$$
\sigma' = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{21}
$$


Thus, although the large block is oriented such that the stress
tensor causes only normal tractions, giving compression along
the $x_2$ axis and tension along the $x_1$ axis, only shear tractions
act on the smaller block because its sides are oriented
differently. The negative shear stress values yield tractions in the $-x'_2$
direction on the face with normal $\hat{\mathbf{e}}'_1$, and in the $x'_2$ direction on
the opposite face with normal $-\hat{\mathbf{e}}'_1$, consistent with what we
expect from the normal tractions on the larger block. Although the
components of the stress tensor in the two coordinate systems
differ, they represent the same entity, the physical state of stress.
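
The rotation example can be reproduced numerically. The following is a minimal sketch of Eqn 18 for a rotation of the axes by an angle about the $x_3$ axis; the helper names (`matmul`, `transpose`, `rotate_stress`) are illustrative, and the input values are those of the example ($\sigma_1 = 1$, $\sigma_2 = -1$, $\theta = 45°$).

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Transpose of a 3x3 matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]


def rotate_stress(sigma, theta_deg):
    """Return sigma' = A sigma A^T for axes rotated by theta about x3."""
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    a = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    return matmul(matmul(a, sigma), transpose(a))


# Diagonal stress tensor of the example: sigma_1 = 1, sigma_2 = -1
sigma = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
sigma_rot = rotate_stress(sigma, 45.0)
```

To within rounding, `sigma_rot` has zero diagonal terms and off-diagonal terms of -1, so the same state of stress appears as pure shear in the rotated axes.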




Fig. 2.3-6 An example of the stress tensor’s different components in
different coordinate systems. In the $x_1$, $x_2$ coordinate system, the
stress tensor is diagonal. In contrast, shear stresses act on a volume with
faces normal to the $x'_1$ and $x'_2$ coordinate axes, which are rotated by $\theta$
with respect to the $x_1$, $x_2$ axes.


2.3.4 Principal stresses

For a given state of stress, the traction vector acting on most 
surfaces within a material has components both normal to the 
surface and tangential to it. There are, however, some surfaces 
oriented such that the shear tractions on them vanish. These 
surfaces can be characterized by their normal vectors, called
principal stress axes; the normal stresses on these surfaces are
called principal stresses. The concept of principal stress axes is
important for discussion of earthquake source mechanisms
(Section 4.2).

To find the principal stresses, we use the concepts of
eigenvalues and eigenvectors (Section A.5.2). The shear
components of the traction will be zero if the traction and normal
vectors are parallel, such that they differ only by a
multiplicative constant, $\lambda$,

$$ T_i = \sigma_{ij} n_j = \lambda n_i. \tag{22} $$

Thus the principal stress axes $\hat{\mathbf{n}}$ are the eigenvectors of the stress
tensor, and the principal stresses $\lambda$ associated with each one are
the eigenvalues. The eigenvalues and eigenvectors can be found
by solving the system of homogeneous linear equations

$$
(\sigma_{ij} - \lambda\delta_{ij})\, n_j =
\begin{pmatrix}
\sigma_{11} - \lambda & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} - \lambda & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33} - \lambda
\end{pmatrix}
\begin{pmatrix} n_1 \\ n_2 \\ n_3 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \tag{23}
$$

where the Kronecker delta symbol $\delta_{ij} = 0$ except when $i = j$,
in which case it equals 1 (Eqn A.3.37). A nontrivial solution
exists only for values of $\lambda$ such that the matrix is singular (has
no inverse), which occurs when its determinant is zero (Section
A.4.3),

$$
\det \begin{pmatrix}
\sigma_{11} - \lambda & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} - \lambda & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33} - \lambda
\end{pmatrix} = 0. \tag{24}
$$

Multiplying out the determinant gives the characteristic
polynomial

$$ \lambda^3 - I_1\lambda^2 + I_2\lambda - I_3 = 0, \tag{25} $$

whose coefficients, the invariants of the stress tensor, are
independent of the coordinate system. In particular, $I_1$ is the
trace, or sum of the diagonal elements, which has physical
significance, as discussed in Section 2.3.6.

The roots $\lambda$ of Eqn 25 are the eigenvalues or principal
stresses, denoted $\sigma_m$, which are often ordered by decreasing
value $\sigma_1 > \sigma_2 > \sigma_3$. In geology, where all stresses are
compressive (negative), we usually order the principal stresses by
magnitude, so $|\sigma_1| > |\sigma_2| > |\sigma_3|$. Each eigenvalue is then
substituted into Eqn 23 to find the components of the associated
eigenvector $\hat{\mathbf{n}}^{(m)}$. Because the stress tensor is symmetric, the
three eigenvectors are automatically orthogonal if the roots are
distinct (Section A.5.3), so there are three mutually
perpendicular surfaces on which there is no tangential traction. Even
if there are multiple roots, it is still always possible to find
orthogonal $\hat{\mathbf{n}}^{(m)}$.

The principal stress axes are perpendicular and can be used
as basis vectors for a useful coordinate system in which the
stress tensor is diagonal. To transform vectors into this new
coordinate system, we use a rotation matrix (Section A.5.1)
whose rows are the components of the basis vectors of the new
coordinate system written in the old coordinate system. In this
case the rows are the eigenvectors, and the transformation
matrix is

$$
A = \begin{pmatrix}
n^{(1)}_1 & n^{(1)}_2 & n^{(1)}_3 \\
n^{(2)}_1 & n^{(2)}_2 & n^{(2)}_3 \\
n^{(3)}_1 & n^{(3)}_2 & n^{(3)}_3
\end{pmatrix}. \tag{26}
$$

Defining the diagonal matrix containing the eigenvalues as $\Lambda$,

$$
\Lambda = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix}, \tag{27}
$$

we can describe all the eigenvalue-eigenvector pairs by writing
Eqn 22 as a matrix equation,

$$ \sigma A^{T} = A^{T}\Lambda. \tag{28} $$

Carrying out the tensor transformation (Eqn 18) shows that
the stress tensor in the new coordinate system is now diagonal,

$$ \sigma' = A\sigma A^{T} = \Lambda, \qquad \sigma'_{ij} = \sigma_i\,\delta_{ij}, \tag{29} $$

where summation over i is not implied. To see why the stress 
tensor is diagonal, recall that each row of the stress tensor 
contains the components of the traction vector acting on a 
plane perpendicular to a coordinate axis. The new coordinate 
axes were chosen to be the principal stress axes, so on surfaces 
with these as normals the normal traction is the only nonzero 
component of the traction vector. 
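
As a check on Eqn 25, the sketch below computes the three invariants for the two forms of the example tensor (Eqns 19 and 21 with $\sigma_1 = 1$, $\sigma_2 = -1$). The text does not spell out $I_2$ and $I_3$; the code assumes the standard expressions, in which $I_2$ is the sum of the principal 2×2 minors and $I_3$ is the determinant. The function names are illustrative.

```python
def invariants(s):
    """Return (I1, I2, I3) of a 3x3 tensor s, using the sign convention of Eqn 25."""
    i1 = s[0][0] + s[1][1] + s[2][2]
    # Assumed standard form: sum of the principal 2x2 minors
    i2 = (s[0][0] * s[1][1] - s[0][1] * s[1][0]
          + s[1][1] * s[2][2] - s[1][2] * s[2][1]
          + s[0][0] * s[2][2] - s[0][2] * s[2][0])
    # Assumed standard form: determinant
    i3 = (s[0][0] * (s[1][1] * s[2][2] - s[1][2] * s[2][1])
          - s[0][1] * (s[1][0] * s[2][2] - s[1][2] * s[2][0])
          + s[0][2] * (s[1][0] * s[2][1] - s[1][1] * s[2][0]))
    return i1, i2, i3


def characteristic_polynomial(lam, s):
    """Evaluate lam^3 - I1 lam^2 + I2 lam - I3 (left side of Eqn 25)."""
    i1, i2, i3 = invariants(s)
    return lam ** 3 - i1 * lam ** 2 + i2 * lam - i3


diagonal_form = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
rotated_form = [[0.0, -1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

inv_diagonal = invariants(diagonal_form)
inv_rotated = invariants(rotated_form)
```

Both forms give the same invariants, and the polynomial vanishes at 1, 0, and -1, the principal stresses of the example.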

2.3.5 Maximum shear stress and faulting 

An important seismological application of the principal stresses 
is that the simplest theory for rock fracture predicts that 
faulting will occur on the plane on which the shear stress is 
highest (Section 5.7.2). Although this is not exactly true, it 
gives insight into the relation between fault orientations and 
regional tectonics. 

Given a state of stress, we can find the plane of maximum 
shear stress using the diagonalized stress tensor (Eqn 29), and 
thus a coordinate system whose basis vectors are the principal 
stress axes. By Eqn 11 the traction on a plane with normal
vector $\hat{\mathbf{n}}$ is

$$ T_i = \sigma_{ij} n_j = \sigma_i\,\delta_{ij}\, n_j = \sigma_i n_i, \tag{30} $$

where summation over $i$ is not implied. The squared magnitude
of the traction normal to the surface is $(\mathbf{T}\cdot\hat{\mathbf{n}})^2 = (T_i n_i)^2$,
so, using the triangular geometry (Fig. 2.3-7), the squared
magnitude of $\tau$, the tangential traction along the surface, can be
written as a function of the components of the normal vector

$$
\tau^2(n_1, n_2, n_3) = T_j T_j - (T_i n_i)^2
= (\sigma_1^2 n_1^2 + \sigma_2^2 n_2^2 + \sigma_3^2 n_3^2)
- (\sigma_1 n_1^2 + \sigma_2 n_2^2 + \sigma_3 n_3^2)^2. \tag{31}
$$

Fig. 2.3-7 Traction vector $\mathbf{T}$ acting on the surface $dS$, decomposed into
two components. The normal traction is parallel to the normal, $\hat{\mathbf{n}}$, whereas
$\tau$ is the tangential traction parallel to the surface.
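
Eqn 31 can be explored numerically before working through the algebra. The sketch below scans unit normals on a one-degree grid in the principal-axis frame and records the largest tangential traction. The principal stress values, the grid spacing, and the function names are hypothetical choices for illustration.

```python
import math


def tau_squared(n, s):
    """Eqn 31: squared tangential traction for unit normal n and principal stresses s."""
    traction_sq = sum((si * ni) ** 2 for si, ni in zip(s, n))   # T_j T_j
    normal = sum(si * ni * ni for si, ni in zip(s, n))          # T_i n_i
    return traction_sq - normal ** 2


def max_shear(s, steps=180):
    """Brute-force search over unit normals; returns (tau_max, normal)."""
    best_t2, best_n = 0.0, None
    for i in range(steps + 1):
        colat = math.pi * i / steps
        for j in range(2 * steps):
            lon = math.pi * j / steps
            n = (math.sin(colat) * math.cos(lon),
                 math.sin(colat) * math.sin(lon),
                 math.cos(colat))
            t2 = tau_squared(n, s)
            if t2 > best_t2:
                best_t2, best_n = t2, n
    return math.sqrt(best_t2), best_n


# Hypothetical compressive principal stresses, ordered sigma_1 > sigma_2 > sigma_3
principal = (-1.0, -2.0, -4.0)
tau_max, n_max = max_shear(principal)
```

For these values the search returns a maximum of 1.5, which is half the difference between the largest and smallest principal stresses, on a normal with components of magnitude $1/\sqrt{2}$ along the first and third axes and none along the second. Eqn 31 is the expression analyzed next.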


This expression lets us find planes, characterized by their
normal vectors $\hat{\mathbf{n}}$, on which $\tau^2$ is a maximum. We eliminate $n_3$
using the fact that $n_3^2 = 1 - n_1^2 - n_2^2$, so

$$
\begin{aligned}
\tau^2(n_1, n_2) ={}& n_1^2(\sigma_1^2 - \sigma_3^2) + n_2^2(\sigma_2^2 - \sigma_3^2) + \sigma_3^2 \\
& - [n_1^2(\sigma_1 - \sigma_3) + n_2^2(\sigma_2 - \sigma_3) + \sigma_3]^2.
\end{aligned}
\tag{32}
$$

At the maxima of $\tau^2$, its derivatives with respect to $n_1$ and $n_2$
are zero:

$$
\begin{aligned}
\frac{\partial \tau^2}{\partial n_1} &= 2n_1(\sigma_1 - \sigma_3)\{(\sigma_1 + \sigma_3) - 2[n_1^2(\sigma_1 - \sigma_3) + n_2^2(\sigma_2 - \sigma_3) + \sigma_3]\} = 0, \\
\frac{\partial \tau^2}{\partial n_2} &= 2n_2(\sigma_2 - \sigma_3)\{(\sigma_2 + \sigma_3) - 2[n_1^2(\sigma_1 - \sigma_3) + n_2^2(\sigma_2 - \sigma_3) + \sigma_3]\} = 0.
\end{aligned}
\tag{33}
$$

The first equation is satisfied if $n_1 = 0$, in which case $n_2^2 = 1/2$
satisfies the second equation because the term in braces is zero.
For these values $n_3^2 = 1/2$, yielding a plane with unit normal
$\hat{\mathbf{n}} = (0, 1/\sqrt{2}, 1/\sqrt{2})$. A second plane is found by setting $n_2 = 0$,
so the first equation yields $\hat{\mathbf{n}} = (1/\sqrt{2}, 0, 1/\sqrt{2})$. Eliminating $n_1$
from Eqn 31 using the method used for $n_3$ yields two similar
equations that can be solved for the third solution,
$\hat{\mathbf{n}} = (1/\sqrt{2}, 1/\sqrt{2}, 0)$.

Each of these planes bisects the 90° angle between a pair of
principal stress axes. Because two such planes can be defined
for each pair of axes, there are other solutions. For example,
because the condition for $n_1 = 0$ was that $n_2^2 = n_3^2 = 1/2$,
$\hat{\mathbf{n}} = (0, -1/\sqrt{2}, 1/\sqrt{2})$ is also a solution.

To find the value of $\tau^2$ as a function of $\hat{\mathbf{n}}$, we rewrite Eqn 31

$$
\tau^2(n_1, n_2, n_3) = n_1^2 n_2^2(\sigma_1 - \sigma_2)^2 + n_2^2 n_3^2(\sigma_2 - \sigma_3)^2
+ n_3^2 n_1^2(\sigma_1 - \sigma_3)^2. \tag{34}
$$

This equation shows that of the three possible local maxima of
the tangential traction, the largest value is

$$ \tau = (\sigma_1 - \sigma_3)/2, \tag{35} $$

where $\sigma_1$ is the maximum principal stress and $\sigma_3$ is the
minimum principal stress. This occurs on the planes with unit
normal vectors

$$ \hat{\mathbf{n}} = (1/\sqrt{2}, 0, 1/\sqrt{2}) \quad \text{and} \quad \hat{\mathbf{n}} = (-1/\sqrt{2}, 0, 1/\sqrt{2}). \tag{36} $$

Thus the planes of maximum shear stress are halfway between
the maximum (1, 0, 0) and minimum (0, 0, 1) principal stress
axes, and contain the intermediate principal stress axis. The
derivatives (Eqn 33) are also zero at local minima,
corresponding to the principal stress axes where $\tau^2 = 0$.

To apply this theory, consider an experiment in which a rock
is compressed (Fig. 2.3-8) such that the principal stresses are
negative, with $|\sigma_1| > |\sigma_2| > |\sigma_3|$. We expect fracture on the
planes of maximum shear stress. By Eqn 36, there are two such
planes, each 45° from the maximum and minimum principal
stress axes and including the intermediate principal stress axis.
Either plane is equally likely to fracture. Alternatively, if the
experiment is conducted in a common laboratory situation
known as uniaxial compression, where $|\sigma_1| > |\sigma_2| = |\sigma_3|$,
failure should occur on any plane 45° from the maximum
principal stress ($\sigma_1$) axis. Experiments (Section 5.7.2) support
the idea that fracture is controlled by shear stress, but in a
more complicated way such that the fracture plane is often





about 25°, rather than 45°, from the maximum principal stress
direction.

Fig. 2.3-8 Schematic illustration of an experiment in which a cylindrical
rock sample is compressed along the direction of the maximum principal
stress until fracture occurs. The minimum principal stresses $\sigma_2$ and $\sigma_3$
are approximately equal. If fracture occurs on a plane of maximum shear
stress, the rock breaks on a plane 45° from the direction of maximum
principal stress.

For simplicity, however, assume that faults in the earth form
on the planes of maximum shear stress. We will see (Section
2.3.10) that the earth’s surface is a free surface, where tractions
must be zero. Hence, at the surface one principal stress axis
must be vertical, and the other two must be parallel to the
surface. The three basic fault geometries (strike-slip, normal,
and thrust) are related to the stress axes (Fig. 2.3-9). If the
vertical principal stress is the most compressive, the fault dips at
45°, and normal faulting occurs. If, instead, the vertical
principal stress is the least compressive, the fault geometry is the
same, but reverse or thrust faulting occurs.² When the vertical
principal stress is the intermediate principal stress, strike-slip
motion occurs on a fault plane 45° from the maximum
principal stress. Thus the geometry of faults, which can be mapped
geologically or inferred from seismograms of earthquakes, can
be used to study stress orientations. This model is subject
to limitations, especially because earthquakes often occur on
preexisting faults (Section 5.7.2). Nonetheless, the approach
is useful, especially when integrated with other methods of
estimating stress directions.

² Seismologists sometimes use the terms reverse and thrust fault interchangeably,
whereas structural geologists reserve the term thrust for a shallow-dipping reverse
fault.

[Figure panel labels: reverse or thrust faulting; strike-slip faulting; fault planes shown in side view and map view.]

Fig. 2.3-9 Stress fields associated with three types of faulting, assuming
that the earthquake occurred on a plane of maximum shear stress. Normal
(a), reverse (b), and strike-slip (c) faulting involve different orientations of
the principal stresses.

2.3.6 Deviatoric stresses

Large compressive stresses occur at depth within the earth
due to the weight of the overlying rock. It is convenient in many
applications to remove the effect of the overall compressive
stress and consider only the deviations from it. We thus define
the mean stress

$$ M = (\sigma_{11} + \sigma_{22} + \sigma_{33})/3 = \sigma_{ii}/3 \tag{37} $$

as 1/3 of the sum of the normal stresses, the trace of the stress
tensor. The mean stress can be related to the principal stresses,
because the trace of the stress tensor is independent of the
coordinate system.

To see that the trace does not change, we write the
transformation of the stress tensor between two coordinate systems
(Eqn 18) in terms of the components, using the summation
convention (Section A.3.5)

$$ \sigma'_{ij} = A_{ik}\sigma_{kl}A^{T}_{lj} = A_{ik}\sigma_{kl}A_{jl}. \tag{38} $$

The trace can be written

$$ \sigma'_{ii} = A_{ik}\sigma_{kl}A_{il} = \delta_{kl}\sigma_{kl} = \sigma_{kk}, \tag{39} $$

because $A$ is an orthogonal matrix, so that $A_{ik}A_{il} = \delta_{kl}$. Thus the
trace is invariant under an orthogonal transformation, and so
is known as the first invariant of a tensor. The other two
invariants (Eqn 25) are also preserved by such transformations.

The mean stress can thus be written in terms of the trace of
the diagonalized stress tensor (Eqn 29)

$$ M = (\sigma_1 + \sigma_2 + \sigma_3)/3 \tag{40} $$

as 1/3 of the sum of the principal stresses. The deviatoric stress
tensor is defined by removing the effect of the mean stress

$$ D_{ij} = \sigma_{ij} - M\delta_{ij}. \tag{41} $$

Thus, when the principal stresses are large and nearly equal,
the deviatoric stress tensor removes their effect and indicates
the remaining stress state. The deviatoric stress tensor can be
diagonalized and has the same principal stress axes as the stress
tensor.
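
A short sketch of this decomposition follows. The stress values are hypothetical (in MPa, compression negative), chosen so that the normal stresses are large and nearly equal; the function names are illustrative.

```python
def mean_stress(s):
    """One third of the trace of the stress tensor (Eqn 37)."""
    return (s[0][0] + s[1][1] + s[2][2]) / 3.0


def deviatoric(s):
    """Subtract the mean stress from the diagonal terms of s."""
    m = mean_stress(s)
    return [[s[i][j] - (m if i == j else 0.0) for j in range(3)]
            for i in range(3)]


# Hypothetical near-lithostatic state of stress, in MPa
sigma = [[-100.0, 5.0, 0.0],
         [5.0, -104.0, 0.0],
         [0.0, 0.0, -96.0]]

m = mean_stress(sigma)
dev = deviatoric(sigma)
```

Here the mean stress is -100 MPa and the deviatoric tensor has diagonal terms 0, -4, and 4 MPa, with the shear terms unchanged. Its trace is zero, so it carries only the departure from the mean compression.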

This concept is important in discussing processes in the
earth, because the deviatoric stresses result from tectonic forces
and cause earthquake faulting and seismic wave propagation
effects like anisotropy. At depths greater than a few kilometers,
we often assume that a lithostatic state of stress exists, where
the normal stresses are equal to minus the pressure of the
overlying material and the deviatoric stresses are zero. Because
the weight per unit area of a column of material of height $z$ and
density $\rho$ is $\rho g z$, the pressure at a depth of 3 km beneath a column
of rock with density 3 g/cm$^3$ is

$$
P = (3\ \mathrm{g/cm^3})(980\ \mathrm{cm/s^2})(3 \times 10^5\ \mathrm{cm})
\approx 9 \times 10^8\ \mathrm{dyn/cm^2} = 0.9\ \mathrm{kbar}. \tag{42}
$$

The approximation that the pressure at 3 km depth is about
1 kbar (100 MPa) is useful to remember.
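
The arithmetic in Eqn 42 can be reproduced directly. This is a minimal sketch in cgs units; the function name and its default value of $g$ are illustrative choices, and the conversion factors are those given earlier in this section.

```python
def lithostatic_pressure_cgs(density_g_cm3, depth_km, g_cm_s2=980.0):
    """Return rho * g * z in dyn/cm^2 for a uniform column of rock."""
    depth_cm = depth_km * 1.0e5
    return density_g_cm3 * g_cm_s2 * depth_cm


pressure = lithostatic_pressure_cgs(3.0, 3.0)   # dyn/cm^2
pressure_kbar = pressure / 1.0e9                # 1 kbar = 10^9 dyn/cm^2
pressure_mpa = pressure / 1.0e7                 # 1 MPa = 10^7 dyn/cm^2
```

The result is 8.82 × 10$^8$ dyn/cm$^2$, or about 0.88 kbar (88 MPa), which rounds to the values quoted in the text.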

The pressure causes compression and thus negative values of 
the principal stresses. If the state of stress at depth is lithostatic, 
the mean stress equals the negative of the pressure. Because 
deviatoric stresses exist, this relation is only approximate, but 
it is useful because the mean stress is usually thought to be 
much greater than the deviatoric stress. 


2.3.7 Equation of motion

Now that we can describe the forces acting on the surface of a
material element in terms of the stresses, we write Newton’s
second law (Eqn 1) in terms of body forces and stresses. This is
the first step to deriving the equations describing seismic wave
propagation.

Consider the forces acting on a block of material of density $\rho$
and volume $dx_1\,dx_2\,dx_3$ with sides perpendicular to the
coordinate axes (Fig. 2.3-10). The net body force, if any, is $f_i\,dx_1\,dx_2\,dx_3$,
where $f_i$ is the force per unit volume at the center of the block.
The total force is the sum of the surface forces on each face plus
the body force within the material.

For example, the net surface force in the $x_2$ direction is
the sum of three terms, each of which describes the net force
due to the difference in traction between opposing faces. The
first term involves the difference between the traction in the $\hat{\mathbf{e}}_2$
direction resulting from the stress on the face with normal $\hat{\mathbf{e}}_2$
and that on the opposite face with normal $-\hat{\mathbf{e}}_2$. Because stress
is force per unit area, we multiply this difference by the area of
the two faces, $dx_1\,dx_3$, and use a Taylor series to obtain the net
force due to these two faces,

$$ \frac{\partial \sigma_{22}}{\partial x_2}\,dx_2\,dx_1\,dx_3. \tag{43} $$

We then do the same for the force in the $x_2$ direction due to the
pairs of faces with normals $\pm\hat{\mathbf{e}}_1$ and $\pm\hat{\mathbf{e}}_3$. Summing the three
terms, adding the body force component, and equating this net
force to the density times this component of the acceleration
yields

$$
\left(\frac{\partial \sigma_{12}}{\partial x_1} + \frac{\partial \sigma_{22}}{\partial x_2} + \frac{\partial \sigma_{32}}{\partial x_3}\right) dx_1\,dx_2\,dx_3 + f_2\,dx_1\,dx_2\,dx_3
= \rho\,\frac{\partial^2 u_2}{\partial t^2}\,dx_1\,dx_2\,dx_3. \tag{44}
$$

The first three terms give the net force from the tractions on
opposite faces of the cube. As we saw, each stress component
canceled with its value from the opposite face, so only the
partial derivative of that component contributes to the net force.
Hence the spatial variation of the stress field,³ rather than the
stress field itself, causes a net force. Dividing by the volume of
the block yields

$$
\frac{\partial \sigma_{12}}{\partial x_1} + \frac{\partial \sigma_{22}}{\partial x_2} + \frac{\partial \sigma_{32}}{\partial x_3} + f_2 = \rho\,\frac{\partial^2 u_2}{\partial t^2}. \tag{45}
$$

³ A field is a quantity that varies in space (Section A.6.1).

Similar equations apply for the $x_1$ and $x_3$ components of
the force and acceleration. The set of three equations can be
written simply using the summation convention

$$
\frac{\partial \sigma_{ji}(\mathbf{x}, t)}{\partial x_j} + f_i(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{46}
$$

Here the fact that the stresses, forces, and displacements can
vary in both space and time is explicitly written. Alternatively,
because the stress tensor is symmetric, we can write

$$
\frac{\partial \sigma_{ij}(\mathbf{x}, t)}{\partial x_j} + f_i(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{47}
$$

Note that the force in the $i$ direction is obtained by summing
over the faces $j$ of the block. If the partial derivative with
respect to $x_j$ is denoted by a comma, Eqn 47 becomes

$$
\sigma_{ij,j}(\mathbf{x}, t) + f_i(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{48}
$$

This equation, called the equation of motion, is satisfied
everywhere in a continuous medium. It expresses Newton’s
second law, $\mathbf{F} = m\mathbf{a}$, in terms of surface and body forces. The
acceleration results from the body force and $\sigma_{ij,j}$, the
divergence of the stress tensor. A stress field that does not vary with
position has no divergence, and hence produces no force. It is
interesting to note that the divergence of the stress tensor gives
rise to a force, which is a vector, just as the divergence of a
vector yields a scalar (Section A.6.3).

An important form of the equation of motion describes a
body at equilibrium, whose acceleration is zero, so the
divergence of the stress tensor exactly balances the body forces

$$ \sigma_{ij,j}(\mathbf{x}, t) = -f_i(\mathbf{x}, t). \tag{49} $$

This equation of equilibrium must be satisfied for any static
elasticity problem, such as finding the stresses due only to
gravity.
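
As an illustration of Eqn 49, the sketch below checks the vertical component of the equilibrium equation for a lithostatic stress field under gravity. It assumes the $x_3$ axis points downward, a uniform density, and no shear stresses, so that only the derivative of $\sigma_{33}$ with respect to $x_3$ enters. The density and gravity values and the function names are hypothetical.

```python
RHO = 3000.0   # density in kg/m^3 (hypothetical)
G = 9.8        # gravitational acceleration in m/s^2


def sigma33(z):
    """Vertical normal stress in Pa at depth z in m; compression is negative."""
    return -RHO * G * z


def body_force3(z):
    """Gravitational body force per unit volume along +x3 (downward)."""
    return RHO * G


def equilibrium_residual(z, dz=1.0):
    """Centered finite-difference estimate of d(sigma33)/dx3 + f3."""
    d_sigma = (sigma33(z + dz) - sigma33(z - dz)) / (2.0 * dz)
    return d_sigma + body_force3(z)


residual = equilibrium_residual(3000.0)
```

The residual is zero to within rounding: the downward body force is balanced by the increase of compressive stress with depth.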

Another important form, if no body forces are applied, is 



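$$
\sigma_{ij,j}(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{50}
$$

This is called the homogeneous equation of motion, where
“homogeneous” refers to the lack of forces, as in the
terminology of linear equations (Section A.4.4). This equation describes
seismic wave propagation except at a source, such as an
earthquake or an explosion, where a body force generates seismic
waves.

2.3.8 Strain

If stresses are applied to a material that is not rigid, points
within it move with respect to each other, and deformation
results. The strain tensor describes the deformation resulting
from the differential motion within the body.

Consider an element of solid material within which
displacements $\mathbf{u}(\mathbf{x})$ have occurred. If a point originally at $\mathbf{x}$ is displaced
by $\mathbf{u}$ (Fig. 2.3-11), we describe the displacement of a nearby
point originally at $\mathbf{x} + \delta\mathbf{x}$ by expanding the components of the
displacement vector in a Taylor series,

$$
u_i(\mathbf{x} + \delta\mathbf{x}) \approx u_i(\mathbf{x}) + \frac{\partial u_i(\mathbf{x})}{\partial x_j}\,\delta x_j = u_i(\mathbf{x}) + \delta u_i, \tag{51}
$$

so that the relative displacement near $\mathbf{x}$, $\delta u_i$, is to the first order

$$
\delta u_i = \frac{\partial u_i(\mathbf{x})}{\partial x_j}\,\delta x_j, \tag{52}
$$

where the partial derivatives are evaluated at $\mathbf{x}$.

Fig. 2.3-11 Geometry showing how deformation arises from the relative
displacement $\delta\mathbf{u}$ between two points originally separated by $\delta\mathbf{x}$.

Although we are interested in deformation that distorts
the body, there can also be a rigid body translation or a rigid
body rotation, neither of which produces deformation. To
distinguish these effects, we add and subtract $\partial u_j/\partial x_i$ to Eqn 52
and then separate it into two parts

$$
\delta u_i = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)\delta x_j
+ \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} - \frac{\partial u_j}{\partial x_i}\right)\delta x_j
= (e_{ij} + \omega_{ij})\,\delta x_j. \tag{53}
$$

The $\omega_{ij}$ term corresponds to a rigid body rotation without
deformation. To see this, note that because $\omega_{ij}$ is antisymmetric
($\omega_{ij} = -\omega_{ji}$), the diagonal terms are zero, and there are only three
independent components. We can then form a vector $\boldsymbol{\omega}$ with
components

$$ \omega_k = \varepsilon_{stk}\,\omega_{st}/2, \tag{54} $$

where $\varepsilon_{stk}$ is the permutation symbol (Eqn A.3.39). Using the
identity

$$ \varepsilon_{ijk}\varepsilon_{stk} = \varepsilon_{kij}\varepsilon_{kst} = \delta_{is}\delta_{jt} - \delta_{it}\delta_{js}, \tag{55} $$

we find that

$$ \varepsilon_{ijk}\,\omega_k = \varepsilon_{ijk}\varepsilon_{stk}\,\omega_{st}/2 = (\omega_{ij} - \omega_{ji})/2 = \omega_{ij}. \tag{56} $$

Thus the last term in Eqn 53 can be written as

$$ \omega_{ij}\,\delta x_j = \varepsilon_{ijk}\,\omega_k\,\delta x_j = -\boldsymbol{\omega} \times \delta\mathbf{x}, \tag{57} $$

which is the displacement from a rigid rotation of $|\boldsymbol{\omega}|$ about an
axis in the $\boldsymbol{\omega}$ direction (Eqn A.3.31). Hence this term does not
reflect deformation.

The other term in Eqn 53, $e_{ij}$, is the strain tensor, a
symmetric tensor describing the internal deformation. Its tensor
components

$$ e_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) \tag{58} $$

are spatial derivatives of the displacement field, $\mathbf{u}(\mathbf{x})$. If the
displacement field does not vary, its derivatives are zero, so there
is no deformation, only a rigid body translation.

The strain tensor can be written in terms of the $x$, $y$, $z$ axes
using the derivatives of the displacement vector components
$(u_x, u_y, u_z)$:

$$
e = \begin{pmatrix}
\dfrac{\partial u_x}{\partial x} & \dfrac{1}{2}\left(\dfrac{\partial u_x}{\partial y} + \dfrac{\partial u_y}{\partial x}\right) & \dfrac{1}{2}\left(\dfrac{\partial u_x}{\partial z} + \dfrac{\partial u_z}{\partial x}\right) \\
\dfrac{1}{2}\left(\dfrac{\partial u_y}{\partial x} + \dfrac{\partial u_x}{\partial y}\right) & \dfrac{\partial u_y}{\partial y} & \dfrac{1}{2}\left(\dfrac{\partial u_y}{\partial z} + \dfrac{\partial u_z}{\partial y}\right) \\
\dfrac{1}{2}\left(\dfrac{\partial u_z}{\partial x} + \dfrac{\partial u_x}{\partial z}\right) & \dfrac{1}{2}\left(\dfrac{\partial u_z}{\partial y} + \dfrac{\partial u_y}{\partial z}\right) & \dfrac{\partial u_z}{\partial z}
\end{pmatrix}. \tag{59}
$$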


The components of the strain tensor are dimensionless
because they have units of length divided by length. The
components are of two different types. The diagonal components
show how the displacement in the direction of a coordinate axis
varies along that axis. For example, if displacement occurs
only in the $x_1$ direction ($u_2 = 0$, $u_3 = 0$) and $u_1$ changes only in
that direction, then the only nonzero term in the tensor is $e_{11}$.
Extension occurs along the $x_1$ axis if $\partial u_1/\partial x_1 > 0$ (Fig. 2.3-12a),
whereas contraction occurs if it is negative (Fig. 2.3-12b). If $e_{11}$
were constant within the material, it would equal the change in
length per unit length along the $x_1$ axis. The other diagonal
terms, $e_{22}$ and $e_{33}$, represent similar strains along their
coordinate axes.

The off-diagonal components describe changes along a
coordinate axis of displacement in another direction. A simple
case (Fig. 2.3-12c) is when only $u_1 \neq 0$, but $u_1$ changes only
along the $x_2$ axis, so only $e_{12}$ and $e_{21}$ are nonzero. We can also
have both $\partial u_1/\partial x_2$ and $\partial u_2/\partial x_1$ nonzero (Fig. 2.3-12d, e).
Depending on the relative values of the derivatives, the strain
components describe various deformations.

The strain tensor can be characterized by its eigenvectors,
the principal strain axes, and associated eigenvalues, the
principal strains. The strain tensor is diagonal when expressed in a
coordinate system whose basis vectors are the principal strain
axes. The trace or sum of diagonal terms of the strain tensor,

$$
\theta = e_{ii} = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3} = \nabla \cdot \mathbf{u}, \tag{60}
$$

known as the dilatation, equals the divergence of the
displacement field $\mathbf{u}(\mathbf{x})$. The dilatation has physical significance because
it gives the change in volume per unit volume associated with
the deformation. To see this, note that in the principal strain
axes coordinate system a block of material with initial volume
$dx_1\,dx_2\,dx_3$ has a volume after deformation (Fig. 2.3-13) of

$$
\left(1 + \frac{\partial u_1}{\partial x_1}\right) dx_1 \left(1 + \frac{\partial u_2}{\partial x_2}\right) dx_2 \left(1 + \frac{\partial u_3}{\partial x_3}\right) dx_3, \tag{61}
$$

which, to first order, is

$$
\left(1 + \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}\right) dx_1\,dx_2\,dx_3 = (1 + \theta)\,dx_1\,dx_2\,dx_3. \tag{62}
$$

Thus, if we define the initial volume as $V = dx_1\,dx_2\,dx_3$,

$$ V + \Delta V = (1 + \theta)V, \quad \text{so} \quad \theta = \Delta V/V, \tag{63} $$

and the dilatation is the change in volume per unit volume.

It is worth noting that we have discussed the strain tensor in
Cartesian coordinates. This tensor is more complicated when
formulated in other coordinate systems, because it involves
spatial derivatives of the basis vectors (Section A.7.4).
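
The split of the displacement gradient into strain and rotation (Eqn 53) and the first-order volume change (Eqns 60-63) can be sketched as follows. The displacement gradients are hypothetical small values, and the function and variable names are illustrative.

```python
def strain_and_rotation(grad_u):
    """Split grad_u[i][j] = du_i/dx_j into symmetric e_ij and antisymmetric w_ij."""
    e = [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)] for i in range(3)]
    w = [[0.5 * (grad_u[i][j] - grad_u[j][i]) for j in range(3)] for i in range(3)]
    return e, w


def dilatation(e):
    """Trace of the strain tensor (Eqn 60)."""
    return e[0][0] + e[1][1] + e[2][2]


# u_1 varying only along x_2 (the simple shear case in the text)
simple_shear = [[0.0, 2.0e-6, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Antisymmetric gradient: a small rigid rotation about x_3
small_rotation = [[0.0, -1.0e-6, 0.0], [1.0e-6, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Extension or contraction along each axis (principal strain axes)
stretch = [[1.0e-6, 0.0, 0.0], [0.0, 2.0e-6, 0.0], [0.0, 0.0, -0.5e-6]]

e_shear, w_shear = strain_and_rotation(simple_shear)
e_rot, w_rot = strain_and_rotation(small_rotation)
e_stretch, w_stretch = strain_and_rotation(stretch)

theta = dilatation(e_stretch)
exact_volume_ratio = (1 + 1.0e-6) * (1 + 2.0e-6) * (1 - 0.5e-6)   # Eqn 61 divided by V
```

The simple shear case gives $e_{12} = e_{21} = 10^{-6}$ plus a rotation term of the same size, the rigid rotation gives zero strain, and for the stretch case the exact volume ratio differs from $1 + \theta$ only at second order in the strains.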


2.3.9 Constitutive equations 

Various materials respond differently to an applied stress. For a 
given stress, a more rigid material responds with smaller strains 
than occur in a less rigid material. The relation between stress 
and strain is given by the material’s constitutive equation.

The simplest type of materials are linearly elastic, such that
there is a linear relation between the stress and strain tensors. 
We will see that when the earth behaves as linearly elastic, it 
gives rise to seismic waves. Linear elasticity is valid for the short 
time scale involved in the propagation of seismic waves, but not 
for longer time scales. On time scales of thousands of years or 
longer, the earth flows as a viscous fluid (Section 5.7.3).

Fig. 2.3-13 Change in volume of a small block of material with faces
normal to the coordinate axes, due to the principal strains. The fractional
change in volume is the dilatation, the sum of the principal strains.


In assuming that material is elastic, we also assume that the
displacements from an unstrained initial state are small. This
assumption, known as infinitesimal strain theory, is generally
valid for seismic waves. For example, a body wave may have a
displacement on the order of 10 microns, and a wavelength on
the order of 10 km. Expressing all quantities in meters, the
resulting strain is about $(10^{-5}/10^{4}) = 10^{-9}$, certainly small enough
for infinitesimal theory to be valid. However, for strains greater
than about $10^{-4}$, the linear relation between stress and strain
fails. This occurs in regions of the earth’s mantle under very
high pressure, or when rocks break during an earthquake
(Section 5.7.2).

The stress and strain for a linearly elastic material are related
by a constitutive equation called Hooke’s law,

$$ \sigma_{ij} = c_{ijkl}\,e_{kl}, \tag{64} $$

written here using the summation convention. The constants
$c_{ijkl}$, the elastic moduli, describe the properties of the material.
To understand how the elastic moduli affect the equation of
motion, we write the constitutive equation (64) using the fact
that the strains are derivatives of the displacement,

$$ \sigma_{ij} = c_{ijkl}\,u_{k,l}. \tag{65} $$

Substituting this expression in Eqn 48 gives the equation of
motion in terms of the displacements:

$$
\sigma_{ij,j}(\mathbf{x}, t) + f_i(\mathbf{x}, t) = (c_{ijkl}\,u_{k,l})_{,j}(\mathbf{x}, t) + f_i(\mathbf{x}, t) = \rho\,\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. \tag{66}
$$

Thus the elastic moduli control how displacements evolve in
time and space in response to an applied force, and so, as we
will see in the next section, determine the velocity of seismic
waves.

The elastic moduli $c_{ijkl}$ form a more complicated tensor than
we have dealt with so far. It has four subscripts and relates
the stress and strain tensors, each of which has two
subscripts. This situation is analogous to the way in which the
stress tensor, with two subscripts, relates the normal and
traction vectors, each with one subscript. Because the subscripts
each range from 1 to 3, $c_{ijkl}$ has $3^4$, or 81, components.
Fortunately, the number of independent components is reduced by
symmetry considerations. The stress and strain tensors are
symmetric, so

$$ c_{ijkl} = c_{jikl}, \qquad c_{ijkl} = c_{ijlk}, \tag{67} $$

which reduces the number of independent components to 36,
because there are 6 independent components of each of the
stress and strain tensors. A further symmetry relation

$$ c_{ijkl} = c_{klij}, \tag{68} $$

based on the idea of strain energy, which we will discuss later,
reduces the number of independent components that
characterize a general elastic medium to 21.
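
The counts 81, 36, and 21 can be confirmed by brute force: treat each index combination $(i, j, k, l)$ as a component and merge those that Eqns 67 and 68 require to be equal. The sketch below uses a simple union-find structure; the function names are illustrative.

```python
from itertools import product


def count_independent(symmetries):
    """Count groups of index tuples (i, j, k, l) forced equal by the given symmetries."""
    parent = {idx: idx for idx in product(range(3), repeat=4)}

    def find(x):
        # Follow parent links to the representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for idx in list(parent):
        for sym in symmetries:
            a, b = find(idx), find(sym(idx))
            if a != b:
                parent[a] = b
    return len({find(idx) for idx in parent})


def swap_first_pair(t):      # c_ijkl = c_jikl (Eqn 67)
    return (t[1], t[0], t[2], t[3])


def swap_second_pair(t):     # c_ijkl = c_ijlk (Eqn 67)
    return (t[0], t[1], t[3], t[2])


def swap_pairs(t):           # c_ijkl = c_klij (Eqn 68)
    return (t[2], t[3], t[0], t[1])


n_all = count_independent([])
n_after_67 = count_independent([swap_first_pair, swap_second_pair])
n_after_68 = count_independent([swap_first_pair, swap_second_pair, swap_pairs])
```

The three counts are 81, 36, and 21, in agreement with the text.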

On a large scale, material within the earth has approxim¬ 
ately the same physical properties regardless of orientation, a 
condition known as isotropy. For an isotropic material, the c- kl 
have further symmetries, so there are only two independent 
elastic moduli, which can be defined in various ways. One 
useful pair are the Lame constants A and /i, which are defined 
such that 

c ijki = xs ij s kt + ^ s ik s n + S il S ik )• < 69 > 

In terms of the Lame constants, the constitutive equation 
(Eqn 64) for an isotropic material is written 

+ + (70) 

where 6 is the dilatation. So, for example, a n = X6+2pe u , and 
cr 12 = 2pe n . We will use this constitutive relation to study 
seismic waves in the next section. We will also see that the 
velocities of seismic waves depend on the elastic moduli, so 
in an isotropic material the velocities of seismic waves do not 


depend on the direction in which they propagate. Deviations 
from isotropy occur in many parts of the earth, notably in the 
oceanic lithosphere and at the base of the mantle (Section 3.7). 
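
As a numerical illustration of Eqns 69 and 70, the sketch below builds the isotropic tensor c_{ijkl} from two Lamé constants and confirms that contracting it with a strain tensor gives the same stress as the two-term isotropic law. It assumes NumPy is available; the strain values are arbitrary and chosen only for the check.

```python
import numpy as np


def isotropic_moduli(lam, mu):
    """c_ijkl = lam d_ij d_kl + mu (d_ik d_jl + d_il d_jk), as in Eqn 69."""
    d = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", d, d)
            + mu * (np.einsum("ik,jl->ijkl", d, d)
                    + np.einsum("il,jk->ijkl", d, d)))


def isotropic_stress(lam, mu, strain):
    """sigma_ij = lam theta d_ij + 2 mu e_ij, as in Eqn 70."""
    theta = np.trace(strain)  # dilatation
    return lam * theta * np.eye(3) + 2.0 * mu * strain


# Illustrative values only: moduli in dyn/cm^2, an arbitrary symmetric strain.
lam, mu = 3.0e11, 3.0e11
strain = 1.0e-6 * np.array([[1.0, 0.5, 0.0],
                            [0.5, -0.2, 0.3],
                            [0.0, 0.3, 0.4]])

c = isotropic_moduli(lam, mu)
sigma_general = np.einsum("ijkl,kl->ij", c, strain)   # sigma_ij = c_ijkl e_kl
sigma_isotropic = isotropic_stress(lam, mu, strain)

print(np.allclose(sigma_general, sigma_isotropic))
```

The same array `c` can also be used to confirm the symmetries of Eqns 67 and 68 by comparing it with its index-permuted copies.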

Although the c_{ijkl} completely describe the behavior of an
elastic material, they are hard to visualize. This is also true for
the Lamé constant λ.⁴ By contrast, μ, called the rigidity or shear
modulus, has a simple physical interpretation. Consider the
response of an isotropic elastic body to an applied shear stress
σ_12. In this case, the term in the constitutive equation (Eqn 70)
involving the dilatation is zero (recall that δ_12 = 0), so only a
shear strain, e_12 = σ_12/2μ, results. The response to shear is thus
described by the rigidity, and μ must be nonnegative so that the
sense of strain is consistent with the applied stress (consider
Fig. 2.3-12c). A material with large μ is quite rigid and responds to
a given stress with a small strain. By contrast, a given shear
stress produces a larger strain in a material with lower rigidity.
A material in which μ is zero cannot support shear stresses, and
corresponds to a perfect fluid, one with zero viscosity. In such a
fluid, the stress tensor is diagonal in any coordinate system, and
the pressure equals the negative of the mean stress. Although
perfect fluids do not exist,⁵ the ocean can generally be treated
this way for seismic waves incident on the sea floor. Even more
surprisingly, the hot iron fluid thought to comprise the earth’s
outer core can be described as an ideal fluid for seismological
purposes.

Other elastic constants that can be defined in terms of simple
experiments are often useful. The incompressibility, or bulk
modulus, K, is defined by subjecting a body to a lithostatic
pressure dP, such that

d\sigma_{ij} = -dP\,\delta_{ij}. (71)

For an isotropic elastic body, the resulting strains, from Eqn 70,
are

-dP\,\delta_{ij} = \lambda\,d\theta\,\delta_{ij} + 2\mu\,de_{ij}. (72)

Setting i = j and summing yields

-3\,dP = 3\lambda\,d\theta + 2\mu\,d\theta, (73)

because δ_ii = 3. The bulk modulus is thus the ratio of the
pressure applied to the fractional volume change that results:

K = \frac{-dP}{d\theta} = \lambda + \frac{2}{3}\mu. (74)

The term incompressibility is apt because the larger the value of
K, the smaller the volume change produced by a given pressure.
K is greater than zero, because otherwise objects would expand
when compressed.⁶ In an ideal fluid, K = λ, so in this case λ has
an easy physical interpretation.

4 Unfortunately, this Lamé constant is not only hard to interpret; it has no
common name and is denoted by the same symbol as is used for wavelength.

5 Perfect fluids have been called “dry water” to illustrate that no real fluid behaves
exactly this way.

Writing the constitutive equation (70) in terms of K and μ,

\sigma_{ij} = K\theta\delta_{ij} + 2\mu(e_{ij} - \theta\delta_{ij}/3) (75)

shows that the response to an applied stress has two parts: a
volume change characterized by K and a shear deformation,
or change in shape, characterized by μ.

Two other elastic constants are defined by pulling the material
along only one axis, leading to a state of stress called
uniaxial tension. If the tension is applied along the x_1 axis,
then by Equation 70,

\sigma_{11} = (\lambda + 2\mu)e_{11} + \lambda e_{22} + \lambda e_{33}
\sigma_{22} = 0 = \lambda e_{11} + (\lambda + 2\mu)e_{22} + \lambda e_{33}
\sigma_{33} = 0 = \lambda e_{11} + \lambda e_{22} + (\lambda + 2\mu)e_{33}. (76)

Subtracting the last two equations shows that e_22 = e_33, so

e_{22} = e_{33} = \frac{-\lambda}{2(\lambda + \mu)} e_{11} = -\nu e_{11}, (77)

where ν, defined as Poisson's ratio, gives the ratio of the
contraction along the other two axes to the extension along the
axis where tension was applied. Substituting in the first line in
Eqn 76 yields

\frac{\sigma_{11}}{e_{11}} = \frac{\mu(3\lambda + 2\mu)}{\lambda + \mu} = E, (78)

where E is called Young's modulus, the ratio of the tensional
stress to the resulting extensional strain.

The elastic constants E, ν, and K are often used in engineering
because they are easily measured by simple experiments.
However, for seismic wave propagation, λ, μ, and sometimes
K are more natural constants.⁷ Box 2.3-1 gives conversions
between the various elastic constants.

Many seismological problems are simplified by assuming
that λ = μ. Such a material, called a Poisson solid, is often
a good approximation for the earth. In this case, Poisson’s
ratio equals 0.25, Young’s modulus E = (5/2)μ, and the bulk
modulus K = (5/3)μ.

Because strain is dimensionless, the elastic constants λ, μ, E,
and K all have dimensions of stress. For the earth’s crust, μ is
approximately 3 × 10^11 dyn/cm^2. For comparison, the rigidity
of steel is about 8 × 10^11 dyn/cm^2. Young’s modulus for the
crust, assuming a Poisson solid, is 7.5 × 10^11 dyn/cm^2,
compared to 5 × 10^9 dyn/cm^2 for rubber.
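
The relations among the elastic constants (Eqns 74, 77, and 78) are easy to evaluate numerically. The short sketch below, with an illustrative function name, converts the Lamé constants to K, ν, and E and applies them to a Poisson solid with the crustal rigidity quoted above.

```python
def elastic_constants(lam, mu):
    """Convert Lame constants to bulk modulus, Poisson's ratio, Young's modulus."""
    bulk_modulus = lam + 2.0 * mu / 3.0                     # Eqn 74
    poisson_ratio = lam / (2.0 * (lam + mu))                # Eqn 77
    young_modulus = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)  # Eqn 78
    return {"K": bulk_modulus, "nu": poisson_ratio, "E": young_modulus}


# Poisson solid (lam = mu) with the crustal rigidity from the text, dyn/cm^2.
mu_crust = 3.0e11
crust = elastic_constants(lam=mu_crust, mu=mu_crust)

for name, value in crust.items():
    print(name, value)
```

For λ = μ the results reduce to ν = 0.25, E = (5/2)μ, and K = (5/3)μ, consistent with the values stated for a Poisson solid.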


6 Such strange materials have been manufactured synthetically. 

7 In engineering the shear modulus μ is often termed G.





2.3.10 Boundary conditions 

For a string (Section 2.2.3), wave propagation across an 
interface depends on boundary conditions that relate the dis¬ 
placements and tractions across the interface. In the earth, we 
conduct similar analyses for three types of interface. 

The boundary conditions at the earth’s surface are derived 
for most seismological purposes by neglecting the atmosphere 
and treating the surface as a boundary between a solid and a 
vacuum. In this approximation, the earth’s surface is a free 
surface , not subject to any force. At a free surface with normal 
n the traction vector is zero, giving a constraint on those stress 
components that affect the components of the traction: 

T_i = \sigma_{ij} n_j = 0. (79)

Thus, in a coordinate system in which the surface is horizontal,
the normal vector is n_j = δ_j3, and T_i = σ_i3 n_3, so

\sigma_{13} = \sigma_{23} = \sigma_{33} = 0. (80)

The components of the stress tensor that do not affect the
tractions, in this case σ_11, σ_12, and σ_22, are unconstrained.
Similarly, no restriction is placed on the displacements. A free 
surface corresponds in the one-dimensional case to a string 
whose end is free to move. 

There are also interfaces between two solids, a solid and 
a liquid, and between two liquids. Their boundary conditions 
are obtained by considering a volume, sometimes called a 
Gaussian pill box , along the interface between different mater¬ 
ials (Fig. 2.3-14). The volume’s long axis is along the interface, 
so the surface area, S, is large relative to the volume, V. We
integrate the homogeneous equation of motion (Eqn 50) over 
the volume 
Fig. 2.3-14 “Gaussian pill box” used to formulate the boundary
conditions across an interface between medium 1 and medium 2.
Application of the divergence theorem shows that the traction
vector must be continuous across the interface, but that the
entire stress tensor need not be.



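\int_V \left[ \frac{\partial\sigma_{ij}(\mathbf{x}, t)}{\partial x_j} - \rho\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2} \right] dV = 0, (81)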


and use the divergence theorem (Eqn A.6.10) to transform the 
first term to a surface integral, giving 


\int_S \sigma_{ij}(\mathbf{x}, t) n_j \, dS - \int_V \rho\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2} \, dV = 0, (82)


where n_j is the j component of the unit outward normal vector
at each point on S. In the limit as the thickness approaches zero, 
the volume integral becomes negligible, so 


\int_S \sigma_{ij}(\mathbf{x}, t) n_j \, dS = 0. (83)

Because the thickness goes to zero, we neglect the ends, so that
for the integral to be zero, the contributions from the top (+)
and bottom (-) surfaces must satisfy

(\sigma_{ij} n_j)^+ + (\sigma_{ij} n_j)^- = 0. (84)

Hence, because the unit normal on top is opposite that on the
bottom (n_j^+ = -n_j^-), the three components of the traction
vector, T_i = σ_ij n_j, must be continuous across the interface.

The continuity of traction leads to conditions on specific
stress components, depending on the orientation of the
interface. For example, if the interface is horizontal, then
n_j = δ_j3, so

T_i = \sigma_{ij}\delta_{j3} = \sigma_{i3} (85)

must be continuous. If, instead, the boundary between two
solids were vertical, then n_j = δ_j1, so

T_i = \sigma_{ij}\delta_{j1} = \sigma_{i1} (86)

would be continuous. Because the continuity conditions are
for tractions rather than stresses, the stress components not
involved in the traction condition need not be continuous.

At the interface between two solids, sometimes called a
“welded” interface, all components of the displacement are
continuous because no overlaps or tears occur. For the same
reason, the tractions are continuous. This is the condition we
used at the junction between two strings in Section 2.2.3.

At the interface between a solid and a perfect fluid the fluid
can slip along the interface because its rigidity is zero, so it
cannot support shear stress. Hence the components of traction
tangential to the interface are zero in the fluid and, by the
condition of continuity, in the solid as well. Thus the
tangential displacement components need not be continuous, but
the normal components of the traction and displacement are
continuous.

Table 2.3-1 summarizes the boundary conditions for a
horizontal interface between different media.

Table 2.3-1 Boundary conditions.

Interface        Boundary conditions
solid-solid      T_i, u_i continuous
solid-liquid     T_1 = T_2 = 0; T_3, u_3 continuous
free surface     T_i = 0


2.3.11 Strain energy

Because applying a force to an elastic material causes
deformation, potential energy is stored within the material, as we
saw for waves on a string (Section 2.2.4). To motivate this elastic
strain energy, consider a spring with a restoring force f = -kx.
Compressing the spring a distance x requires work against
the spring, equal to the integral of the force applied times the
distance. If the spring is initially at equilibrium, the work is

W = \int_0^x kx \, dx = \frac{1}{2}kx^2, (87)

which equals the potential energy stored in the spring.

By analogy, the strain energy stored in a volume is the integral
of the product of stress and strain components summed

W = \frac{1}{2}\int \sigma_{ij} e_{ij} \, dV = \frac{1}{2}\int c_{ijkl} e_{ij} e_{kl} \, dV. (88)

The strain energy is symmetric in ij and kl, providing the
rationale for the statement (Eqn 68) that the tensor of elastic
constants has the symmetry c_{ijkl} = c_{klij}.
2.4 Seismic waves


2.4.1 The seismic wave equation

The ideas of elasticity in the last section let us show that the
equation of motion has solutions that describe the two types
of propagating seismic (or elastic) waves, compressional and
shear waves. We will see that these wave types propagate
differently, with velocities that depend in different ways on
the elastic properties of the material. Our approach to showing
that the equations of elasticity have propagating wave
solutions is conceptually similar to the way we showed
(Section 2.2) that the physics of a string gives rise to traveling
waves. In that analysis, we first demonstrated that waves
occur on a uniform string, and then considered how waves
propagate between strings of differing properties. That
analysis considered propagating waves without regard to how
they were generated.

Following that approach, we consider a homogeneous¹ region,
one of uniform properties, within an elastic material. We
assume that the region contains no source of seismic waves,
which requires a body force. Once the waves propagate away
from the source, the relation between the stresses and
displacements is given by the homogeneous equation of motion,
which includes no body force term, so F = ma becomes

\sigma_{ij,j}(\mathbf{x}, t) = \rho\frac{\partial^2 u_i(\mathbf{x}, t)}{\partial t^2}. (1)

1 Unfortunately, this word is used for two different concepts: a homogeneous
medium has properties that do not vary with position, whereas a homogeneous
equation has no forcing function or source term.

Before solving the equation, two points are worth noting.
The equation of motion can be written and solved entirely in
terms of displacements, because the stress is related to the
strain, which is formed from derivatives of the displacement.
The stress and strain are related by the constitutive relation,
which characterizes the material. Thus, although the equation
of motion does not depend on the elastic constants, the
solution does. Second, the equation of motion relates spatial
derivatives of the stress tensor to a time derivative of the
displacement vector. The resulting solutions give the displacement
vector and hence the strain and stress tensors as functions of
both space and time. Often, for simplicity, these dependences
are not explicitly written.

We solve Eqn 1 in Cartesian (x, y, z) coordinates, beginning
with the x component,

\frac{\partial\sigma_{xx}(\mathbf{x}, t)}{\partial x} + \frac{\partial\sigma_{xy}(\mathbf{x}, t)}{\partial y} + \frac{\partial\sigma_{xz}(\mathbf{x}, t)}{\partial z} = \rho\frac{\partial^2 u_x(\mathbf{x}, t)}{\partial t^2}. (2)

To express this in terms of displacements, we use the
constitutive law for an isotropic elastic medium (Eqn 2.3.70),

\sigma_{ij} = \lambda\theta\delta_{ij} + 2\mu e_{ij}, (3)

and write the strains in terms of displacements, which yields

\sigma_{xx} = \lambda\theta + 2\mu e_{xx} = \lambda\theta + 2\mu\frac{\partial u_x}{\partial x}
\sigma_{xy} = 2\mu e_{xy} = \mu\left(\frac{\partial u_x}{\partial y} + \frac{\partial u_y}{\partial x}\right)
\sigma_{xz} = 2\mu e_{xz} = \mu\left(\frac{\partial u_x}{\partial z} + \frac{\partial u_z}{\partial x}\right). (4)

We then take derivatives of the stress components

\frac{\partial\sigma_{xx}}{\partial x} = \lambda\frac{\partial\theta}{\partial x} + 2\mu\frac{\partial^2 u_x}{\partial x^2}
\frac{\partial\sigma_{xy}}{\partial y} = \mu\left(\frac{\partial^2 u_x}{\partial y^2} + \frac{\partial^2 u_y}{\partial y \partial x}\right)
\frac{\partial\sigma_{xz}}{\partial z} = \mu\left(\frac{\partial^2 u_x}{\partial z^2} + \frac{\partial^2 u_z}{\partial z \partial x}\right), (5)

using the fact that for a homogeneous material the elastic
constants do not vary with position. Finally, substituting the
derivatives into the equation of motion and using the
definitions of the dilatation

\theta = \nabla\cdot\mathbf{u} = \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z} (6)

and of the Laplacian (Section A.6.5)

\nabla^2(u_x) = \frac{\partial^2 u_x}{\partial x^2} + \frac{\partial^2 u_x}{\partial y^2} + \frac{\partial^2 u_x}{\partial z^2} (7)

yields

(\lambda + \mu)\frac{\partial\theta}{\partial x} + \mu\nabla^2(u_x) = \rho\frac{\partial^2 u_x}{\partial t^2} (8)

for the x component of the equation of motion (1).

Similar equations can be obtained for the y and z components
of displacement. The three equations can be combined,
using the vector Laplacian of the displacement field

\nabla^2\mathbf{u} = (\nabla^2 u_x, \nabla^2 u_y, \nabla^2 u_z), (9)

into a single vector equation:

(\lambda + \mu)\nabla(\nabla\cdot\mathbf{u}(\mathbf{x}, t)) + \mu\nabla^2\mathbf{u}(\mathbf{x}, t) = \rho\frac{\partial^2\mathbf{u}(\mathbf{x}, t)}{\partial t^2}. (10)

This is the equation of motion for an isotropic elastic medium
written entirely in terms of the displacements, with the
dependence on position and time explicitly written to remind us
that we seek a solution that varies in this way. Equation 10 can be
rewritten using the vector identity (Eqn A.6.23)

\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times(\nabla\times\mathbf{u}) (11)

to obtain

(\lambda + 2\mu)\nabla(\nabla\cdot\mathbf{u}(\mathbf{x}, t)) - \mu\nabla\times(\nabla\times\mathbf{u}(\mathbf{x}, t)) = \rho\frac{\partial^2\mathbf{u}(\mathbf{x}, t)}{\partial t^2}. (12)

Rather than solve Eqn 12 directly, we express the displacement
field in terms of two other functions, φ and Υ, which are
known as potentials:

\mathbf{u}(\mathbf{x}, t) = \nabla\phi(\mathbf{x}, t) + \nabla\times\mathbf{\Upsilon}(\mathbf{x}, t). (13)

In this representation, the displacement is the sum of the
gradient of a scalar potential, φ(x, t), and the curl of a vector
potential,² Υ(x, t), both of which are functions of space and time.
Although this decomposition appears to introduce complexity,
it actually clarifies the problem, because the vector identities
(Section A.6.4)

\nabla\times(\nabla\phi) = 0, \quad \nabla\cdot(\nabla\times\mathbf{\Upsilon}) = 0 (14)

separate the displacement field into two parts. The part
associated with the scalar potential has no curl or rotation and
gives rise to compressional waves. Conversely, the part associated
with the vector potential has zero divergence, causes no volume
change, and corresponds to shear waves. Because taking the
curl discards any part of the vector potential that would give a
nonzero divergence, we require that the vector potential satisfy

\nabla\cdot\mathbf{\Upsilon}(\mathbf{x}, t) = 0.³

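Substituting the potentials into Eqn 12 and rearranging
terms using Eqn 14 yields

(\lambda + 2\mu)\nabla(\nabla^2\phi) - \mu\nabla\times\nabla\times(\nabla\times\mathbf{\Upsilon}) = \rho\frac{\partial^2}{\partial t^2}(\nabla\phi + \nabla\times\mathbf{\Upsilon}). (15)

Using Eqn 11, the second term of Eqn 15 simplifies to

\nabla\times\nabla\times(\nabla\times\mathbf{\Upsilon}) = -\nabla^2(\nabla\times\mathbf{\Upsilon}) + \nabla(\nabla\cdot(\nabla\times\mathbf{\Upsilon}))
    = -\nabla^2(\nabla\times\mathbf{\Upsilon}), (16)

because the divergence of the curl is zero. After this
substitution, the terms in Eqn 15 can be regrouped to give

\nabla\left[(\lambda + 2\mu)\nabla^2\phi(\mathbf{x}, t) - \rho\frac{\partial^2\phi(\mathbf{x}, t)}{\partial t^2}\right] = -\nabla\times\left[\mu\nabla^2\mathbf{\Upsilon}(\mathbf{x}, t) - \rho\frac{\partial^2\mathbf{\Upsilon}(\mathbf{x}, t)}{\partial t^2}\right], (17)

because the elastic constants do not vary with position, and
the order of differentiation has no effect.

2 Although ψ is often used for the vector potential, we use Υ (upsilon) to avoid
confusion with the SV potential in the next section.

3 This decomposition into scalar and vector potentials, known as Helmholtz
decomposition, can be done for any vector field.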

One solution of the equation can be found if both terms in 
brackets are zero. In this case, we have two wave equations, 
one for each potential. The scalar potential satisfies 


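\nabla^2\phi(\mathbf{x}, t) = \frac{1}{\alpha^2}\frac{\partial^2\phi(\mathbf{x}, t)}{\partial t^2}, (18)

with the velocity

\alpha = [(\lambda + 2\mu)/\rho]^{1/2}. (19)

As we will see shortly, this solution corresponds to P, or
compressional, waves. Similarly, the vector potential satisfies

\nabla^2\mathbf{\Upsilon}(\mathbf{x}, t) = \frac{1}{\beta^2}\frac{\partial^2\mathbf{\Upsilon}(\mathbf{x}, t)}{\partial t^2}, (20)

with velocity

\beta = (\mu/\rho)^{1/2}, (21)

and corresponds to S, or shear, waves.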

Equations 18 and 20 are wave equations that are slightly 
different from those that we have previously encountered. 
Waves on a string (Section 2.2) satisfied the wave equation 


\frac{\partial^2 u(x, t)}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 u(x, t)}{\partial t^2}, (22)


describing the propagation of a scalar quantity in one space 
dimension. The scalar potential satisfies a similar scalar wave 
equation, with the difference that the space variable x is in 
three dimensions. The vector potential, a vector quantity, satis¬ 
fies the analogous vector wave equation in three dimensions. 

The wave equations in Eqns 18 and 20 are strictly valid only 
for a homogeneous medium because they were derived assum¬ 
ing that all derivatives of the elastic constants were zero. Al¬ 
though these equations were derived in Cartesian coordinates, 
they are valid in any coordinate system. We next discuss solu¬ 
tions of the wave equation, and then return to these two types 
of waves. 


2.4.2 Plane waves 

The scalar wave equation in three dimensions, 


\nabla^2\phi(\mathbf{x}, t) = \frac{1}{v^2}\frac{\partial^2\phi(\mathbf{x}, t)}{\partial t^2}, (23)

describes how the scalar field φ(x, t) propagates in three
dimensions. By analogy to the equation of motion (Eqn 2.3.50),
Eqn 23 is a homogeneous wave equation, with no forcing
function to act as a source of the waves. If there were, the
inhomogeneous scalar wave equation in three dimensions with
a source term f(x, t),

\nabla^2\phi(\mathbf{x}, t) - \frac{1}{v^2}\frac{\partial^2\phi(\mathbf{x}, t)}{\partial t^2} = f(\mathbf{x}, t), (24)

would apply.

The harmonic wave solution to the scalar wave equation in 
one dimension (Eqn 2.2.6) 

u(x, t) = A e^{i(\omega t \pm kx)} (25)

can be generalized to solve the three-dimensional scalar wave
equation. This solution, known as a harmonic plane wave, is
written⁴

\phi(\mathbf{x}, t) = A\exp(i(\omega t \pm \mathbf{k}\cdot\mathbf{x}))
    = A\exp(i(\omega t \pm k_x x \pm k_y y \pm k_z z)), (26)

where x is now the position vector, and k = (k_x, k_y, k_z) is now
the wave vector, sometimes also called the wavenumber vector.
This solution describes a plane wave propagating in an
arbitrary direction given by the wave vector, in contrast to the
one-dimensional solution that describes propagation along a
coordinate axis. To demonstrate this, we write k = |k| k̂, where
k̂ is a unit vector in the direction of k, so Eqn 26 becomes

\phi(\mathbf{x}, t) = A\exp(i[\omega t - |\mathbf{k}|(\hat{\mathbf{k}}\cdot\mathbf{x})]), (27)

a plane wave propagating in the k̂ direction with velocity

v = \omega/|\mathbf{k}|. (28)



Thus the wave vector describes two important features of a
propagating wave. Its magnitude gives the wavenumber, the
spatial frequency, and its direction gives the direction of
propagation. The wave fronts, which at any time are surfaces
of constant phase (ωt - k · x) and thus constant values of
φ(x, t), are planes perpendicular to the direction of propagation
(Fig. 2.4-1). To see this, note that all points on a plane
perpendicular to the wave vector have the same value of k · x,
because this scalar product is |k| times the projection of x onto
the direction of k. The phase is periodic over a distance along
the propagation direction equal to the wavelength, 2π/|k|. As for
the waves on a string, we can use the complex exponential
formulation so long as we ensure that the displacement is purely
real, either by taking the real part of the complex exponential
or by also using the complex conjugate.

Fig. 2.4-1 Wave fronts for a harmonic plane wave traveling in the
direction indicated by the wave vector k. The wavelength is λ = 2π/|k|.

This solution to the three-dimensional scalar wave equation
can be generalized to solve the vector wave equation in three
dimensions,

\nabla^2\mathbf{\Upsilon}(\mathbf{x}, t) = \frac{1}{v^2}\frac{\partial^2\mathbf{\Upsilon}(\mathbf{x}, t)}{\partial t^2}, (29)

which describes the propagation of a vector field. In Cartesian
coordinates this breaks up into three scalar wave equations:

\nabla^2\Upsilon_x(\mathbf{x}, t) = \frac{1}{v^2}\frac{\partial^2\Upsilon_x(\mathbf{x}, t)}{\partial t^2}
\nabla^2\Upsilon_y(\mathbf{x}, t) = \frac{1}{v^2}\frac{\partial^2\Upsilon_y(\mathbf{x}, t)}{\partial t^2}
\nabla^2\Upsilon_z(\mathbf{x}, t) = \frac{1}{v^2}\frac{\partial^2\Upsilon_z(\mathbf{x}, t)}{\partial t^2}. (30)

The harmonic plane wave solution to the vector wave equation
is then

\mathbf{\Upsilon}(\mathbf{x}, t) = \mathbf{A}\exp(i(\omega t - \mathbf{k}\cdot\mathbf{x})), (31)

which is like Eqn 26 except that Υ(x, t) and the constant A are
vectors.

4 When the arguments of exponentials become lengthy, we sometimes use the
notation exp (x) = e^x for clarity.
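
The claims that a harmonic plane wave solves the scalar wave equation with v = ω/|k|, and that its phase is constant on planes perpendicular to k, can be checked numerically. The sketch below uses central finite differences; the wave vector, frequency, step sizes, and evaluation point are arbitrary illustrative choices rather than values from the text.

```python
import cmath
import math


def plane_wave(x, t, k, omega, amplitude=1.0):
    """phi(x, t) = A exp(i(omega t - k . x)), the outgoing form of Eqn 26."""
    phase = omega * t - sum(ki * xi for ki, xi in zip(k, x))
    return amplitude * cmath.exp(1j * phase)


def wave_equation_residual(x, t, k, omega, h=1e-3, dt=1e-4):
    """Finite-difference estimate of Laplacian(phi) - (1/v^2) d2phi/dt2."""
    v = omega / math.sqrt(sum(ki * ki for ki in k))  # Eqn 28
    center = plane_wave(x, t, k, omega)
    laplacian = 0.0
    for axis in range(3):
        forward, backward = list(x), list(x)
        forward[axis] += h
        backward[axis] -= h
        laplacian += (plane_wave(forward, t, k, omega) - 2.0 * center
                      + plane_wave(backward, t, k, omega)) / h**2
    d2_dt2 = (plane_wave(x, t + dt, k, omega) - 2.0 * center
              + plane_wave(x, t - dt, k, omega)) / dt**2
    return laplacian - d2_dt2 / v**2


k = (0.3, 0.4, 1.2)        # wave vector, rad/km (illustrative)
omega = 2.0 * math.pi      # angular frequency, rad/s (illustrative)
x0, t0 = (10.0, -5.0, 2.0), 1.5

residual = wave_equation_residual(x0, t0, k, omega)

# Move within the wave front: this offset is perpendicular to k.
in_plane_offset = (0.4, -0.3, 0.0)
x1 = tuple(a + b for a, b in zip(x0, in_plane_offset))
phase_difference = abs(plane_wave(x1, t0, k, omega) - plane_wave(x0, t0, k, omega))

print(abs(residual), phase_difference)
```

Both printed numbers should be small: the first is limited by the finite-difference truncation error, the second by floating-point rounding.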

2.4.3 Spherical waves 

A second solution to the three-dimensional scalar wave equation 
yields waves with spherical, rather than planar, wave fronts. 
To obtain this solution, we express a scalar potential, φ(r, t),
and its Laplacian in spherical coordinates (Eqn A.7.17). We
consider spherically symmetric solutions where φ is a function
only of time and the radius r, so only the ∂φ/∂r term in the
Laplacian is nonzero. The spherically symmetric waves satisfy
the homogeneous wave equation

\nabla^2\phi(r, t) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi(r, t)}{\partial r}\right) = \frac{1}{v^2}\frac{\partial^2\phi(r, t)}{\partial t^2}, (32)

where the space variable is the radius r rather than the position
vector r. To solve this equation, we substitute

\phi(r, t) = \xi(r, t)/r (33)

and obtain

\frac{1}{r}\left[\frac{\partial^2\xi}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2\xi}{\partial t^2}\right] = 0. (34)

Because the term in brackets is the scalar wave equation in one
dimension, any function of the form ξ = f(r ± vt) satisfies Eqn 34
when r ≠ 0. Thus any function of the form

\phi(r, t) = f(t \pm r/v)/r (35)

is a spherically symmetric solution to the scalar wave equation. 

This solution describes spherical wave fronts centered about 
the origin r = 0, whose amplitude depends on the distance from 
the origin. When the minus sign is used, Eqn 35 represents 
waves diverging outward from a source at the origin, with the 
amplitude decaying as 1/r. The plus sign yields an incoming
spherical wave, growing in amplitude as 1/r and converging at 
the origin. It is common to impose a radiation condition that 
waves not enter the region of study from far away, and thus to 
discard the incoming wave solution. 

However, Eqn 35 is not a solution to the homogeneous 
equation everywhere in space, because it is infinite at r = 0. 
Physically this is because a wave spreading out from a point 
must have been generated by a seismic source there. Thus the 
outgoing wave, φ(r, t) = f(t - r/v)/r, is actually a solution to the
inhomogeneous wave equation

\nabla^2\phi(r, t) - \frac{1}{v^2}\frac{\partial^2\phi(r, t)}{\partial t^2} = -4\pi\delta(r) f(t). (36)

This represents a point source at the origin with a time function
f(t). The delta function δ(r) (Section 6.2.5) is zero except at
r = 0, but its integral over a volume including the origin is 1. 
Thus, integrating over a volume including the origin shows 
that Eqn 35 is a solution to the inhomogeneous scalar wave 
equation (36) even at the origin. Hence, in seeking a solution to 
the homogeneous equation that yielded spherical waves, we 
have found a solution to the inhomogeneous equation which is 
used to study waves radiated by a seismic source. 

The fact that the spherical wave solution (Eqn 35) represents
an outgoing wave generated at the origin explains the
distance-dependent amplitude factor 1/r, which had no
counterpart for the plane wave solution. As a spherical wave
propagates away from its source, the area of the wave front,
4πr^2, increases. Because, as we will see shortly, the energy per
unit area of the wave front transported by a propagating wave
is proportional to the amplitude squared, the energy per unit
area of the wave front decays as 1/r^2. This decay, called
geometric spreading, conserves energy. Similarly, the energy of
spherical light waves decays with distance from a lamp as 1/r^2.

Fig. 2.4-2 As a spherical wave front moves far from the source, it can be
locally approximated by a plane wave front due to the decreased curvature
of the spherical wave.

A plane wave can be regarded as a limit of a spherical wave 
far from the source, because the spherical wave front becomes 
almost planar (Fig. 2.4-2). This approximation is often used in 
seismology when seismometers are far from an earthquake. 
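
The spherical wave results can also be checked numerically. The sketch below evaluates the outgoing solution f(t - r/v)/r away from the origin, estimates the residual of the spherically symmetric wave equation by finite differences, and confirms that amplitude squared times wave front area is independent of radius. The sinusoidal source time function, the velocity, and the unit proportionality constant between energy and amplitude squared are assumptions made only for this illustration.

```python
import math


def source_pulse(tau):
    """Hypothetical source time function f: a 1 Hz sinusoid."""
    return math.sin(2.0 * math.pi * tau)


def spherical_wave(r, t, v):
    """Outgoing spherical wave phi(r, t) = f(t - r/v) / r (Eqn 35)."""
    return source_pulse(t - r / v) / r


def radial_wave_residual(r, t, v, h=1e-3, dt=1e-4):
    """Estimate (1/r^2) d/dr(r^2 dphi/dr) - (1/v^2) d2phi/dt2 for r != 0."""
    phi0 = spherical_wave(r, t, v)
    d2_dr2 = (spherical_wave(r + h, t, v) - 2.0 * phi0
              + spherical_wave(r - h, t, v)) / h**2
    d_dr = (spherical_wave(r + h, t, v) - spherical_wave(r - h, t, v)) / (2.0 * h)
    d2_dt2 = (spherical_wave(r, t + dt, v) - 2.0 * phi0
              + spherical_wave(r, t - dt, v)) / dt**2
    # The radial Laplacian expands to phi_rr + (2/r) phi_r.
    return d2_dr2 + 2.0 * d_dr / r - d2_dt2 / v**2


def energy_through_sphere(r, peak_source_amplitude=1.0):
    """Wave front area times amplitude squared (proportionality constant = 1)."""
    amplitude = peak_source_amplitude / r
    return 4.0 * math.pi * r**2 * amplitude**2


residual = radial_wave_residual(r=10.0, t=3.2, v=5.0)
print(abs(residual))
print(energy_through_sphere(1.0), energy_through_sphere(100.0))
```

The two energy values agree, which is the statement that geometric spreading conserves energy.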

2.4.4 P and S waves 

We found earlier in this section (Eqn 13) that the displacement 
can be separated into a scalar potential corresponding to P 
waves that satisfies the scalar wave equation 


\nabla^2\phi(\mathbf{x}, t) = \frac{1}{\alpha^2}\frac{\partial^2\phi(\mathbf{x}, t)}{\partial t^2} (37)

and a vector potential corresponding to S waves that satisfies
the vector wave equation

\nabla^2\mathbf{\Upsilon}(\mathbf{x}, t) = \frac{1}{\beta^2}\frac{\partial^2\mathbf{\Upsilon}(\mathbf{x}, t)}{\partial t^2}. (38)

To understand the displacements caused by the two types of
waves, consider a plane wave propagating in the z direction.
The scalar potential for a harmonic plane P wave satisfying
Eqn 37 is

\phi(z, t) = A\exp(i(\omega t - kz)), (39)

so the resulting displacement is the gradient

\mathbf{u}(z, t) = \nabla\phi(z, t) = (0, 0, -ik) A\exp(i(\omega t - kz)), (40)

which has a nonzero component only along the propagation 
direction z (Fig. 2.4-3). The corresponding dilatation is nonzero, 

\nabla\cdot\mathbf{u}(z, t) = -k^2 A\exp(i(\omega t - kz)), (41)

so a volume change occurs. As the wave propagates, the
displacements in the direction of propagation cause material to be
alternately compressed and expanded. Thus the P wave generated
by the scalar potential is called a compressional wave.

Fig. 2.4-3 Displacements produced by plane compressional and shear
waves, shown by a “snapshot” in time. P waves produce displacement in
the direction of wave propagation and a volume change. S waves produce
displacement perpendicular to the direction of wave propagation and
distort the material without any volume change. (Panel labels: “S waves:
ground motion is perpendicular to wave direction”; “P waves: ground
motion is parallel to wave direction”.)

By contrast, for the S wave, or shear wave, described by the 
vector potential 


\mathbf{\Upsilon}(z, t) = (A_x, A_y, A_z)\exp(i(\omega t - kz)), (42)

the resulting displacement field is given by the curl

\mathbf{u}(z, t) = \nabla\times\mathbf{\Upsilon}(z, t) = (ikA_y, -ikA_x, 0)\exp(i(\omega t - kz)), (43)

whose component along the propagation direction z is zero
(Fig. 2.4-3). Thus the only displacement associated with a
propagating shear wave is perpendicular to the direction of
wave propagation. A shear wave causes no volume change,
because the dilatation, ∇ · u(z, t), is zero.

Comparison of the displacements for the P and S waves 
illustrates that a wave is characterized by two directions. One 
is the direction in which the wave propagates; the other is 
the direction in which the field that propagates changes. A com¬ 
pressional wave is an example of a longitudinal wave, because 
the propagating displacement field varies in the direction of 
propagation. A familiar example is a sound wave in air, which 
can be described as a compressional (elastic) wave in an ideal 
fluid. By contrast, a shear wave is an example of a transverse 
wave, because the propagating displacement field varies at 
right angles to the direction of propagation. The waves we 
considered on the string were transverse waves, because waves 
moved along the string, but their displacement was normal to 
the string. Electromagnetic waves are another familiar example 
of transverse waves. 

The component of Υ(z, t) in the direction of wave propagation
(A_z) has no effect on the displacement field because
taking the curl discards it. Thus, setting A_z to zero to satisfy the
requirement that ∇ · Υ(z, t) = 0 imposes no additional
restriction on the displacement. Only A_x and A_y contribute to the
displacement. Because each component of the displacement
depends on only one of these terms, there can be two independent
shear wave fields. For example, by Eqn 43, if A_x is zero the
displacement has only an x component, and if A_y is zero it has
only a y component. Thus shear waves can have two independent
polarizations, as is the case for other transverse waves, such as
light.
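
The P and S displacements for plane wave potentials can be computed directly, because for a field proportional to exp(i(ωt - k · x)) the gradient and curl reduce to multiplication and cross product by -ik. The sketch below, which assumes NumPy and uses arbitrary illustrative amplitudes, reproduces the component pattern of Eqns 40 and 43 and shows that the S displacement has no component along k.

```python
import numpy as np


def p_wave_displacement(A, k_vec, omega, x, t):
    """u = grad(phi) for phi = A exp(i(omega t - k . x)); grad -> -i k."""
    phase = np.exp(1j * (omega * t - np.dot(k_vec, x)))
    return -1j * k_vec * A * phase


def s_wave_displacement(A_vec, k_vec, omega, x, t):
    """u = curl(Upsilon) for Upsilon = A_vec exp(i(omega t - k . x)); curl -> (-i k) x."""
    phase = np.exp(1j * (omega * t - np.dot(k_vec, x)))
    return np.cross(-1j * k_vec, A_vec) * phase


# Illustrative values: propagation along z, as in the text.
k = 1.3
k_vec = np.array([0.0, 0.0, k])
omega = 2.0 * np.pi
x = np.array([1.0, 2.0, 3.0])
t = 0.4
A_vec = np.array([2.0, -1.0, 5.0])   # (A_x, A_y, A_z); A_z drops out of the curl

u_p = p_wave_displacement(1.0, k_vec, omega, x, t)
u_s = s_wave_displacement(A_vec, k_vec, omega, x, t)

phase = np.exp(1j * (omega * t - k * x[2]))
expected_u_s = np.array([1j * k * A_vec[1], -1j * k * A_vec[0], 0.0]) * phase

print(u_p)   # nonzero only along z: longitudinal motion
print(u_s)   # nonzero only along x and y: transverse motion
```

Because the divergence of a plane wave field is -ik · u, the vanishing of k · u for the S wave is the numerical counterpart of the statement that shear waves cause no volume change.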

In real applications, we often define the z axis as the vertical 
direction and orient the x-z plane along the great circle con¬ 
necting a seismic source and a receiver. Plane waves traveling 
on the direct path between the source and the receiver thus pro¬ 
pagate in the x-z plane. The shear wave polarization direc¬ 
tions are defined as SV, for shear waves with displacement in 
the vertical (x-z) plane, and SH , for horizontally polarized 
shear waves with displacement in the y direction, parallel to the 
earth’s surface. Both have displacements perpendicular to the 
propagation direction and the other polarization (Fig. 2.4-4, 
overleaf). Although we could choose any two orthogonal 
polarizations in the plane of the shear wave displacements, 
using SV and SH is particularly convenient. We will see that 
P and SV waves are coupled with each other when they interact 
with horizontal boundaries, whereas SH waves remain 
separate. 

Seismometers record horizontal motions in the north-south 
and east-west directions, which rarely correspond exactly to the 
SH and SV polarizations. As a result, data from the horizontal 
components of seismometers are often rotated. The direction 
connecting the source and the receiver, corresponding to SV 
displacements, is called the radial direction, so a seismo¬ 
gram rotated to this direction is called the radial component. 
Similarly, the orthogonal direction corresponding to SH dis¬ 
placements is called the transverse direction, so a seismogram 
rotated to this direction is called the transverse component. 

Because seismograms record components of the displace¬ 
ment vector, they can be rotated to give their components in a 
new coordinate system using Eqn A.5.9. If the back azimuth 
direction from the receiver to the source (Section A.7.2) is 
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Fig. 2.4-4 Displacement fields for plane P and S waves propagating in the 
x-z plane containing the source and the receiver, where the z axis is 
vertical. The P -wave displacement is along the wave vector k. The S wave 
can be decomposed into two polarizations, S V and SH, perpendicular 
to the wave vector. The SH displacement is purely horizontal (in the y 
direction, out of the page), whereas the S V displacement is in the x-z plane. 


we rotate the north-south (NS) and east-west (EW) compon¬ 
ents into radial (R) and transverse (T) components using 



with 9- 3kI2 - fFigure 2.4-5 shows seismograms recorded at 
an angular distance of 110° from a deep earthquake, where the 
top three traces are the components recorded at the station, and 
the bottom two are the radial and transverse components. 
Various P and S wave phases (Section 3.5), corresponding to 
different ray paths between the source and the seismometer, 
can be seen. Because the back azimuth is 323°, SH and SV
energy is evenly distributed between the north-south and
east-west components, so the S-wave phases are roughly
comparable on both components. When rotated, however, phases
like SKS, SKKS, and PS that involve conversions from P waves
to SV waves appear primarily on the radial component.
Conversely, phases like $S_{\mathrm{diff}}$ that involve primarily SH energy
are largest on the transverse component.
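
As a minimal sketch of this rotation, the Python function below converts sampled east-west and north-south traces into radial and transverse traces. It assumes the rotation angle $\theta = 3\pi/2 - \zeta'$, with the back azimuth $\zeta'$ measured clockwise from north, and a standard two-dimensional rotation of the (EW, NS) pair. The function name and the sample values are illustrative, not taken from the text.

```python
import math


def rotate_to_radial_transverse(ew, ns, back_azimuth_deg):
    """Rotate EW and NS traces into radial (R) and transverse (T) traces.

    Assumes theta = 3*pi/2 - back azimuth, so that positive R points
    away from the source along the source-receiver direction.
    """
    theta = 1.5 * math.pi - math.radians(back_azimuth_deg)
    c, s = math.cos(theta), math.sin(theta)
    radial = [c * e + s * n for e, n in zip(ew, ns)]
    transverse = [-s * e + c * n for e, n in zip(ew, ns)]
    return radial, transverse


# Hypothetical samples: purely northward motion, source due south.
r, t = rotate_to_radial_transverse([0.0, 0.0], [1.0, 2.0], 180.0)
print(r, t)
```

For a source due south of the station (back azimuth 180°), northward motion is directed away from the source, so it maps onto the radial component and leaves the transverse component essentially zero.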

The relative amplitudes on the radial and transverse components
are shown by a particle motion plot of the amplitudes
as a function of time (Fig. 2.4-6). As shown for two time segments
from Fig. 2.4-5, the SKS and SKKS waves are primarily
on the radial or SV component, whereas $S_{\mathrm{diff}}$ is primarily on the
transverse or SH component.

The definitions of the P-wave velocity, termed $\alpha$ or $v_p$,

$$\alpha = [(\lambda + 2\mu)/\rho]^{1/2} = [(K + 4\mu/3)/\rho]^{1/2}, \tag{45}$$




Fig. 2.4-5 Seismograms for a deep (597 km)
earthquake on August 23, 1995, in the Mariana
trench, recorded 110° away at Harvard,
Massachusetts. P-wave phases are best seen on the
vertical component, SV-wave phases are best seen
on the radial component, and SH-wave phases are
best seen on the transverse component.

Fig. 2.4-6 Particle motion plots for two time
segments of the radial and transverse components
shown in Fig. 2.4-5. SKS and SKKS, which are
primarily SV waves, are strongest on the radial
component (left panel, SKS + SKKS), whereas $S_{\mathrm{diff}}$ is
primarily an SH wave, and so is strongest on the
transverse component (right panel, $S_{\mathrm{diff}}$).




and S-wave velocity, termed $\beta$ or $v_s$,

$$\beta = (\mu/\rho)^{1/2}, \tag{46}$$

show that the seismic velocities depend in different ways on the
elastic constants of the material. Because the rigidity $\mu$ and the
bulk modulus $K$ (Eqn 2.3.74) are positive, P waves travel faster
than S waves. Thus the first wave arriving from an earthquake
is always a compressional wave. As a result, the nomenclature
P originally denoted the first-arriving, “primary” wave,
whereas S denoted the “secondary” wave.

Although both velocities depend on the rigidity, the shear
velocity does not depend on the bulk modulus $K$, because these
waves involve no volume changes. Because the shear velocity
is proportional to the square root of the rigidity, shear waves
cannot propagate through an ideal ($\mu = 0$) fluid. However,
compressional waves propagate in an ideal fluid with a velocity
proportional to $K^{1/2}$. Thus only compressional waves can
travel through the earth’s outer core or the ocean. 5 

To get a feel for these wave velocities, consider typical values
for various parameters. The earth’s crust is approximately a
Poisson solid, with elastic constants $\lambda \approx \mu \approx 3 \times 10^{11}$ dyn/cm$^2$.
Thus, for a density of 3 g/cm$^3$, the P-wave velocity is
$5.5 \times 10^5$ cm/s, or 5.5 km/s. Similarly, the S-wave velocity is
$3.2 \times 10^5$ cm/s, or 3.2 km/s. Hence a P wave propagating with a
velocity of 5.5 km/s and a period of 2 s has a wavelength
(Section 2.2) of (5.5 km/s × 2 s) or 11 km. The frequency
is 0.5 s$^{-1}$ (the unit s$^{-1}$ is called a Hertz, or Hz), and the
wavenumber is $2\pi/11 = 0.57$ km$^{-1}$. On the other hand, a wave
with a period of 10 s and the same velocity has a wavelength of
55 km, and a frequency of 0.1 Hz. The longer-period wave has
a longer wavelength and a lower frequency.
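
The arithmetic above can be checked with a short Python sketch. It uses the velocity definitions $\alpha = [(\lambda + 2\mu)/\rho]^{1/2}$ and $\beta = (\mu/\rho)^{1/2}$ with the Poisson-solid values quoted in the text; the function names are illustrative.

```python
import math


def body_wave_velocities(lam, mu, rho):
    """Return (alpha, beta) in cm/s for cgs inputs (dyn/cm^2, g/cm^3)."""
    alpha = math.sqrt((lam + 2.0 * mu) / rho)
    beta = math.sqrt(mu / rho)
    return alpha, beta


def wave_parameters(velocity_km_s, period_s):
    """Return (wavelength in km, frequency in Hz, wavenumber in 1/km)."""
    wavelength = velocity_km_s * period_s
    frequency = 1.0 / period_s
    wavenumber = 2.0 * math.pi / wavelength
    return wavelength, frequency, wavenumber


# Values from the text: lambda = mu = 3e11 dyn/cm^2, rho = 3 g/cm^3.
alpha, beta = body_wave_velocities(3e11, 3e11, 3.0)
print(alpha / 1e5, beta / 1e5)       # velocities converted to km/s
print(wave_parameters(5.5, 2.0))     # 2 s period
print(wave_parameters(5.5, 10.0))    # 10 s period
```

For a Poisson solid the ratio $\alpha/\beta$ is $\sqrt{3}$, which is why the two velocities quoted in the text differ by a factor of about 1.7.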


’ The transverse waves we see at a beach are not seismic waves in the water, but 
instead propagate at the water surface and involve a rolling motion in two dimensions 
similar to Rayleigh waves (Section 2.7.2). 
Fig. 2.4-7 Seismic spectrum showing the frequencies at which various
analyses are conducted. 


The “seismic spectrum,” showing seismic waves of various
frequencies and types, is shown in Fig. 2.4-7. Studies of earthquakes
typically use the period range from approximately 0.1 s
to more than 3000 s, or frequencies from 10 Hz to $3 \times 10^{-4}$ Hz
(0.3 mHz). Higher-frequency waves of 20-80 Hz generated by
explosions or other artificial sources are used in reflection
seismology to explore the earth’s crust. Still higher frequencies,
$3$-$12 \times 10^3$ Hz (3-12 kHz), propagating primarily in the ocean,
are used by marine geophysicists to map the sea floor. At the
other end of the spectrum, ground motions with periods longer
than $10^4$ s are due to slow crustal motions (Section 4.5) rather
than propagating seismic waves.

Earthquake sources generate both P and S waves, with the
S waves generally significantly larger. Figure 2.4-8 shows
seismograms of the three components (vertical, or up-down,
north-south, and east-west) of ground motion from seismic
waves generated by an earthquake ~280 km beneath two
seismic stations in Japan. The seismic waves are coming up 
vertically toward the surface. The first arrival, a P wave, has 
displacement along the direction of propagation, and therefore 
appears primarily on the vertical component. The large later 
arrival, a shear wave, has displacement perpendicular to the 
direction of propagation, and thus appears most on the hori¬ 
zontal components. 
Fig. 2.4-8 Three-component seismograms
at two stations from an earthquake beneath
Japan. Because the stations are nearly above
the earthquake, the P wave has its largest
amplitude on the vertical (U-D, “Up”-“Down”)
components. (Ando et al., 1983.
J. Geophys. Res., 88, 5850-64, copyright
by the American Geophysical Union.)


Fig. 2.4-9 Three-component seismogram of
a magnitude 4.9 shallow-focus earthquake
recorded 64 km away at Mina, Nevada. The
difference in the arrival times of the P and S
waves, $t_s - t_p$, can be used to estimate the
distance between the earthquake and the
seismometer. (Figure labels: station MNV,
September 24, 1982, 07:40:24; horizontal axis, time in s.)


These data also show an interesting effect. The S wave 
on the north-south components arrives earlier than on the 
east-west components. This observation has been interpreted 
as indicating that material beneath the seismic stations is 
~5% anisotropic, such that in this region shear waves with 
displacements in the N-S direction propagate faster than those 
with displacements in the E-W direction. The anisotropy 
(Section 3.6) may reflect the presence of the mineral olivine, 
in which seismic waves propagate at different speeds depend¬ 
ing on their direction with respect to the crystal structure. 
If enough olivine crystals are oriented in a consistent fashion, 
significant anisotropy can result. A second effect that could 
cause significant anisotropy is the presence of a region of 
aligned cracks. 


Figure 2.4-9 shows a different type of seismogram: a record 
of a shallow earthquake in Nevada from a seismic station 
within 100 km of the source. The times when the P and S waves 
arrive can be measured from the seismograms. With a number 
of such observations at different locations, we will see (Chapter 
7) that the location and origin time of the earthquake can be 
determined. Even with one seismic station, something about 
the location of the earthquake can be learned. Although the 
arrival times of the seismic waves cannot be converted to travel 
times without knowing when the earthquake occurred, we can 
learn something from the difference between the P and S arrival 
times. For typical values of the compressional and shear velocities
in the crust, $\alpha = 5.5$ km/s and $\beta = 3.2$ km/s, the times
required for S and P waves to travel a distance of $x$ km are

$$t_s = x/3.2, \qquad t_p = x/5.5. \tag{47}$$

The difference in travel times, which is also the difference in
arrival times,

$$t_s - t_p = x(1/3.2 - 1/5.5) = x/7.6, \tag{48}$$

is thus a function of the distance between the source and the
receiver. Because the S wave arrives about 8 s after the P wave,
the earthquake is about 60 km away, in agreement with the distance
found by an earthquake location program using arrival
times from many seismic stations. This S - P travel time technique
gives an estimate of the distance from the seismometer to
the earthquake, but does not yield the azimuth and hence the
location.$^1$ Given S - P times at several stations, the location can
be found from the requirement that the earthquake must be a
specific distance from each station. Schematically, this method
can be thought of as locating the point on a map where arcs
of circles with the appropriate radii intersect. The problem is
actually more interesting, because the earthquake need not
have occurred at the earth’s surface.
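
The S - P distance estimate is easy to sketch in Python. The function below assumes straight ray paths through a uniform crust with the velocities quoted above (5.5 and 3.2 km/s); the function name and default arguments are illustrative.

```python
def sp_distance_km(sp_time_s, alpha=5.5, beta=3.2):
    """Distance (km) implied by an S - P arrival time difference (s).

    Assumes a uniform medium with P velocity alpha and S velocity beta
    (km/s), so that t_s - t_p = x * (1/beta - 1/alpha).
    """
    return sp_time_s / (1.0 / beta - 1.0 / alpha)


# An S - P time of about 8 s gives a distance of roughly 61 km.
print(round(sp_distance_km(8.0), 1))
```

The factor multiplying the S - P time is about 7.6 km/s for these velocities, so each second of delay corresponds to roughly 7.6 km of distance.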


2.4.5 Energy in a plane wave 

Like waves on a string (Section 2.2.4), seismic waves transport
energy both as kinetic energy and as strain, or potential, energy.
To find this energy, consider harmonic plane S and P waves
traveling in the z direction. An SH wave with displacement in
the y direction is

$$u_y(z, t) = B\cos(\omega t - kz), \tag{49}$$

where this expression is written directly in terms of displacement,
rather than potential. We will see shortly that this is a
useful approach for SH waves.

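The kinetic energy in a volume V is the integral of the sum
of the kinetic energy associated with each component of the
displacement (summed over i),

$$KE = \frac{1}{2}\int_V \rho \left(\frac{\partial u_i}{\partial t}\right)^2 dV, \tag{50}$$

because the mass is $m = \rho\,dV$. Hence for the plane wave
(Eqn 49), the kinetic energy per unit wave front averaged over a
wavelength $\lambda$ is

$$KE = \frac{1}{2\lambda}\int_0^\lambda \rho B^2\omega^2 \sin^2(\omega t - kz)\,dz = \frac{1}{2\lambda}\,\rho B^2\omega^2\,\frac{\lambda}{2} = B^2\omega^2\rho/4. \tag{51}$$

The strain energy (Eqn 2.3.88) is

$$W = \frac{1}{2}\int_V \sigma_{ij} e_{ij}\,dV. \tag{52}$$

Because the only nonzero strain components are

$$e_{32} = e_{23} = \frac{1}{2}\frac{\partial u_y}{\partial z} = Bk\sin(\omega t - kz)/2, \tag{53}$$

the only nonzero stress components are

$$\sigma_{32} = \sigma_{23} = \mu Bk\sin(\omega t - kz), \tag{54}$$

and the strain energy per unit area of wave front averaged over
a wavelength in the propagation direction is

$$W = \frac{1}{2\lambda}\int_0^\lambda \mu B^2k^2\sin^2(\omega t - kz)\,dz = \mu B^2k^2/4 = B^2\omega^2\rho/4, \tag{55}$$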

where the last expression used the fact that $\mu = \beta^2\rho$ and
$\beta k = \omega$. Thus the strain energy and kinetic energy averaged
over a wavelength are equal, as we found for the string. Hence
the total energy averaged over a wavelength is

$$E = KE + W = B^2\omega^2\rho/2, \tag{56}$$

and the average energy flux in the propagation direction is
found by multiplying by the velocity

$$\dot{E} = B^2\omega^2\rho\beta/2. \tag{57}$$

The total energy and flux are proportional to the squares of
the amplitude and the frequency, so for waves of the same
amplitude, the higher-frequency wave transports more energy.
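
The equality of the averaged kinetic and strain energies can be checked numerically. The sketch below integrates the kinetic and strain energy densities of the SH wave $u_y = B\cos(\omega t - kz)$ over one wavelength with a midpoint rule, using $\mu = \beta^2\rho$ and $k = \omega/\beta$. The function name, the number of integration points, and the SI sample values are illustrative assumptions.

```python
import math


def sh_wave_energy(B, omega, rho, beta, n=10000, t=0.0):
    """Wavelength-averaged KE, strain energy, total energy, and flux.

    Integrates 0.5*rho*(du/dt)^2 and 0.5*mu*(du/dz)^2 numerically for
    u = B*cos(omega*t - k*z), per unit area of wave front.
    """
    k = omega / beta
    mu = rho * beta ** 2
    wavelength = 2.0 * math.pi / k
    dz = wavelength / n
    ke = 0.0
    w = 0.0
    for m in range(n):
        z = (m + 0.5) * dz
        s2 = math.sin(omega * t - k * z) ** 2
        ke += 0.5 * rho * (B * omega) ** 2 * s2 * dz
        w += 0.5 * mu * (B * k) ** 2 * s2 * dz
    ke /= wavelength
    w /= wavelength
    total = ke + w
    return ke, w, total, total * beta


# Hypothetical values: 1 mm amplitude, 1 Hz, rho = 3000 kg/m^3, beta = 3200 m/s.
ke, w, total, flux = sh_wave_energy(1e-3, 2.0 * math.pi, 3000.0, 3200.0)
print(ke, w, total, flux)
```

Both averages should agree with $B^2\omega^2\rho/4$, and the flux with $B^2\omega^2\rho\beta/2$, to within the accuracy of the numerical integration.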

Similarly, a plane P wave propagating in the z direction,
described by the scalar potential

$$\phi(z, t) = A\exp(i(\omega t - kz)), \tag{58}$$

has a displacement which is the gradient of the potential,

$$\mathbf{u}(z, t) = \nabla\phi(z, t) = (0, 0, -ik)\,A\exp(i(\omega t - kz)), \tag{59}$$

with real part

$$u_z(z, t) = Ak\sin(\omega t - kz). \tag{60}$$

Using Eqn 50, the kinetic energy per unit wave front averaged
over a wavelength is

$$KE = \frac{1}{2\lambda}\int_0^\lambda \rho A^2k^2\omega^2\cos^2(\omega t - kz)\,dz = A^2\omega^2k^2\rho/4. \tag{61}$$


1 An analogous method is used to estimate that a thunderstorm is a mile away for 
every 5 s between seeing lightning and hearing thunder, because light travels much 
faster than sound (about 330 m/s in air). 


To find the strain energy (Eqn 52), we note that the only
nonzero stress component is

$$\sigma_{zz} = (\lambda + 2\mu)e_{zz} = \rho\alpha^2 e_{zz}, \tag{62}$$
Fig. 2.4-10 Seismograms showing the ground displacement at two
locations in the Marina district of San Francisco from a magnitude 
5 aftershock of the 1989 Loma Prieta earthquake. The shaking on the 
filled land is about an order of magnitude larger than on bedrock. 
(Courtesy of the US Geological Survey.) 

where the last form eliminates the Lamé constant $\lambda$ and lets us
reserve the symbol $\lambda$ for wavelength. Thus the strain energy per
unit wave front averaged over a wavelength is

$$W = \frac{1}{2\lambda}\int_0^\lambda \rho\alpha^2A^2k^4\cos^2(\omega t - kz)\,dz = A^2\omega^2k^2\rho/4, \tag{63}$$

which equals the kinetic energy. Hence the total energy averaged
over a wavelength is

$$E = KE + W = A^2\omega^2k^2\rho/2, \tag{64}$$

and the average energy flux in the propagation direction is
found by multiplying by the P velocity

$$\dot{E} = A^2\omega^2k^2\rho\alpha/2. \tag{65}$$

These expressions differ from those for the energy of the SH
wave by a factor of $k^2$, because $A$ is the amplitude of the potential,
whereas in Eqns 56 and 57 $B$ is the amplitude of the displacement.
If we used the potential amplitude for a shear wave,
the $k^2$ factor would be needed.

The energy flux gives insight into how waves behave when 
they change media. For example, as water waves travel into 
shallower water, their velocities decrease, so their amplitudes 
increase to conserve energy. Eventually the amplitudes exceed 
a critical level, and the wave breaks. Similarly, when seismic 
waves pass from bedrock into soft soil with lower velocity and 
density, their amplitudes increase. This effect is shown by 
Fig. 2.4-10, comparing seismograms of an aftershock of the 
Loma Prieta earthquake from the Marina district of San Fran¬ 
cisco. The ground motion recorded by a seismometer located 
on a layer of soft landfill (bottom) is much larger than that on a 
nearby seismometer installed on bedrock (top). As a result, 
earthquake damage varies between structures built in soils and 
bedrock. 
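
A rough sense of the size of this effect follows from the SH energy flux, $B^2\omega^2\rho\beta/2$. If the flux is assumed to be conserved as a wave of fixed frequency passes from medium 1 into medium 2, the displacement amplitudes scale as $B_2/B_1 = [\rho_1\beta_1/(\rho_2\beta_2)]^{1/2}$. The Python sketch below evaluates this ratio. It is only an order-of-magnitude illustration: it ignores energy reflected at the boundary, resonance in the soft layer, attenuation, and nonlinear soil behavior, and the material values are hypothetical.

```python
import math


def flux_amplification(rho1, beta1, rho2, beta2):
    """Amplitude ratio B2/B1 implied by equal SH energy flux in both media.

    Assumes the same frequency in both media and no reflected energy.
    """
    return math.sqrt((rho1 * beta1) / (rho2 * beta2))


# Hypothetical values: bedrock (2.7 g/cm^3, 3.0 km/s) into soft fill
# (1.8 g/cm^3, 0.2 km/s).
print(flux_amplification(2.7, 3.0, 1.8, 0.2))
```

With these illustrative numbers the amplitude grows by a factor of roughly 5, the same sense as the difference between the filled-land and bedrock records described above.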


2.5 Snell’s law 

2.5.1 The layered medium approximation

In the last section, we saw that the equation of motion for a 
homogeneous elastic medium has solutions in which the dis¬ 
placement is described by potentials satisfying the wave equa¬ 
tion. We now begin to use these solutions to describe seismic 
wave propagation in the earth. Applying results derived for 
an infinite homogeneous medium to a real planet with a com¬ 
plicated internal structure might seem like a large leap. None¬ 
theless, some significant problems can be explored using this 
approach. 

For seismological purposes, we characterize the internal
structure of the solid earth by the distribution of physical properties
that affect seismic wave propagation and can be studied
using seismic waves. We thus deal with the distribution of elastic
properties and density, or, equivalently, of seismic velocities
and density. A seismological model of elastic earth structure is
the set of functions $\alpha(\mathbf{r})$, $\beta(\mathbf{r})$, $\rho(\mathbf{r})$ showing how the velocities
and density depend on the position vector $\mathbf{r}$, and hence the
radius, latitude, and longitude. Seismological results indicate
that this distribution is complicated and difficult to characterize.
For example, downgoing slabs of lithosphere extend
to considerable depths at subduction zones. Fortunately, we
can often make a series of useful approximations (Fig. 2.5-1).
Because the solid earth’s physical properties vary significantly
more with depth than they do laterally, they can be approximated
as spherically symmetric functions $\alpha(r)$, $\beta(r)$, $\rho(r)$ that
depend only on the radius $r$. A medium whose properties vary
only with depth is called laterally homogeneous or stratified, in
contrast to a laterally heterogeneous medium where velocities
vary laterally as well as with depth.

When the characteristic length of the region under consideration
is small compared with the radius of the earth (as, for
example, in local crustal studies), the earth’s curvature can be
neglected. The earth is thus further approximated as a laterally
homogeneous halfspace, with velocities and density characterized
by functions $\alpha(z)$, $\beta(z)$, $\rho(z)$ varying only with the depth $z$.
A further useful simplification is to treat the earth as a halfspace
consisting of finite thickness layers, each of uniform properties.

The layered model is attractive because the solutions
of the equation of motion discussed in the last section
apply exactly only to a homogeneous medium, and each layer
is homogeneous. When a layered earth model is appropriate,
it is possible to take the homogeneous medium solutions
in each layer and “patch” them
together at the interfaces to account for the propagation of seis¬ 
mic waves between layers. This can be done when plane waves 
adequately represent the wave fronts, an assumption that 
applies far enough away from the source that wave fronts 
can be considered planar. Treating a stratified medium as a set 
of uniform layers is analogous to the way we divided a string 
into uniform segments and matched solutions across their 
boundaries. 
Fig. 2.5-1 Schematic illustration of some types of earth models used in
seismology. The most accurate model, a laterally heterogeneous sphere,
is often approximated as being spherically symmetric, with properties
varying only with radius. A spherically symmetric model can be further
approximated for many purposes as a stratified halfspace, in which
properties vary only with depth, or as a layered halfspace composed
of discrete uniform layers.


The real earth is not laterally homogeneous, much less com¬ 
posed of uniform layers, and seismic wave fronts do not extend 
as planes to infinity. The test of whether these approximations 
are useful is whether results derived by applying them to 
seismological data yield geologically meaningful inferences. 
We will see that this is surprisingly often the case. Laterally 
homogeneous models are thus useful both as representations 
of average earth structure and as starting models for more 
detailed investigations. 


2.5.2 Plane wave potentials for a layered medium 

Our first goal is to analyze what happens when a plane P or S 
wave is incident on the boundary between two halfspaces of 
homogeneous and isotropic elastic materials with different 
elastic constants and hence seismic velocities. We will derive 
Snell’s law, the famous relation that describes the bending of 
wave fronts as a plane wave goes from one medium to the 
other. Once we can handle a single boundary, we generalize 
this solution to a stack of homogeneous layers. The layered 
approximation can be used, even when the elastic properties 
vary smoothly, by using a large number of thin layers. 
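
As an illustration of how thin uniform layers can stand in for a smooth velocity profile, the Python sketch below compares the vertical travel time through a linear velocity gradient with the time through a stack of uniform layers sampled at the layer midpoints. The linear profile $v(z) = v_0 + gz$, the function names, and the numerical values are assumptions chosen for the example, not taken from the text.

```python
import math


def layered_vertical_travel_time(v0, gradient, thickness, n_layers):
    """Vertical travel time through n uniform layers approximating v0 + g*z."""
    dz = thickness / n_layers
    total = 0.0
    for m in range(n_layers):
        z_mid = (m + 0.5) * dz              # velocity sampled at mid-layer
        total += dz / (v0 + gradient * z_mid)
    return total


def exact_vertical_travel_time(v0, gradient, thickness):
    """Exact vertical travel time for v(z) = v0 + g*z (g nonzero)."""
    return math.log((v0 + gradient * thickness) / v0) / gradient


# Hypothetical profile: 5.0 km/s at the surface, increasing to 6.5 km/s at 30 km.
exact = exact_vertical_travel_time(5.0, 0.05, 30.0)
for n in (1, 5, 100):
    print(n, layered_vertical_travel_time(5.0, 0.05, 30.0, n) - exact)
```

The misfit shrinks as the number of layers grows, which is the sense in which a finely layered model approximates a smoothly varying one.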


Fig. 2.5-2 Two halfspaces in contact, composed of materials with
different elastic properties. The horizontal interface is in the x-y plane.


The geometry of the problem is shown in Fig. 2.5-2. We 
consider a plane wave with its direction of propagation, and 
thus wave vector, in the x-z plane. The displacements can be 
written using potentials that are functions only of x and z. Two 
halfspaces of different materials are in contact along a bound¬ 
ary that is the x-y plane, and the z axis, the normal to the inter¬ 
face, is positive downwards. This geometry has the attractive 
feature that the shear waves can be separated into the two 
polarizations discussed in the previous section: SV waves, 
whose displacement is only in the x-z plane, and SH waves, 
whose displacement has only a y component. Moreover, the 
displacement and hence potentials do not vary with y, and so 
can be written as functions of x, z, and t. 

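In Eqn 2.4.13 we saw that the displacement field can be
decomposed into a scalar potential describing P waves and a
vector potential for S waves. To separate the SV and SH waves,
we split the vector potential $\boldsymbol{\Upsilon}$ into two terms,

$$\boldsymbol{\Upsilon}(x, z, t) = \boldsymbol{\Psi}(x, z, t) + \nabla\times\boldsymbol{\chi}(x, z, t). \tag{1}$$

The displacement vector can now be written using the scalar
potential, $\phi(x, z, t)$, and the two vector potentials:

$$\mathbf{u}(x, z, t) = \nabla\phi(x, z, t) + \nabla\times\boldsymbol{\Upsilon}(x, z, t) = \nabla\phi(x, z, t) + \nabla\times\boldsymbol{\Psi}(x, z, t) + \nabla\times\nabla\times\boldsymbol{\chi}(x, z, t). \tag{2}$$

We choose the vector potentials to be

$$\boldsymbol{\Psi}(x, z, t) = (0, \psi(x, z, t), 0) \quad\text{and}\quad \boldsymbol{\chi}(x, z, t) = (0, \chi(x, z, t), 0). \tag{3}$$

Each potential has zero for its x and z components, and the y
components are the scalar functions $\psi(x, z, t)$ for SV waves
and $\chi(x, z, t)$ for SH waves. Thus the displacement vector is
described by three scalar functions, one for each potential.

To find the resulting displacements, we carry out the vector
operations in Eqn 2. Because the two vector potentials have
only a y component, and none of $\phi$, $\psi$, or $\chi$ depends on y, the
y derivatives are zero. Hence the P, SV, and SH terms give rise
to displacement vectors with (x, y, z) components

$$\begin{aligned}
(P)\quad & \nabla\phi = \left(\frac{\partial\phi}{\partial x},\; 0,\; \frac{\partial\phi}{\partial z}\right) \\
(SV)\quad & \nabla\times\boldsymbol{\Psi} = \left(-\frac{\partial\psi}{\partial z},\; 0,\; \frac{\partial\psi}{\partial x}\right) \\
(SH)\quad & \nabla\times\nabla\times\boldsymbol{\chi} = \left(0,\; -\left(\frac{\partial^2\chi}{\partial x^2} + \frac{\partial^2\chi}{\partial z^2}\right),\; 0\right).
\end{aligned} \tag{4}$$

Thus P and SV contribute to the x and z components of displacement,
whereas SH contributes only to the y component.
The divergences $\nabla\cdot\boldsymbol{\Psi}$ and $\nabla\cdot\boldsymbol{\chi}$ equal zero because only their
y components are nonzero, and $\partial/\partial y$ of these components is
zero. Hence, as expected, neither SH nor SV gives rise to a
volume change.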

The components of the displacement vector are found by
grouping the components from Eqn 4:

$$u_x = \frac{\partial\phi}{\partial x} - \frac{\partial\psi}{\partial z}, \qquad u_y = -\left(\frac{\partial^2\chi}{\partial x^2} + \frac{\partial^2\chi}{\partial z^2}\right), \qquad u_z = \frac{\partial\phi}{\partial z} + \frac{\partial\psi}{\partial x}. \tag{5}$$

These equations demonstrate that P-SV waves are independent
of SH waves. The x and z components of displacement depend
on both the P-wave potential $\phi$ and the SV potential $\psi$. Thus
for waves propagating in the x-z plane, the P and SV waves
form a coupled system, which gives rise to two components
of displacement. Neither the P nor the SV potentials contribute
to the y component of displacement. Hence SH waves, which
alone contribute to the y component of displacement, are
decoupled from P and SV waves.

This coupling and decoupling persists when these waves
interact with a horizontal interface parallel to the x-y plane.
The boundary conditions at the interface constrain the displacements
and tractions (Section 2.3.10). Because the normal
to the interface has only a z component,

$$\mathbf{n} = (0, 0, 1), \qquad n_j = \delta_{j3}, \tag{6}$$

the tractions on the interface are given by

$$T_i = \sigma_{ij}n_j = \sigma_{i3} = (\sigma_{xz}, \sigma_{yz}, \sigma_{zz}). \tag{7}$$

The P-SV system gives rise to nonzero components of displacement
$u_x$ and $u_z$, and hence tractions $\sigma_{xz}$ and $\sigma_{zz}$. For these
waves, both $u_y = 0$ and $\sigma_{yz} = 0$. By contrast, the SH waves
contribute only a y component of displacement, and their only
nonzero traction component is $\sigma_{yz}$. Thus, at the interface, the
P-SV waves have no effect on the SH waves, and vice versa,
so there is no coupling between P-SV waves and SH waves.
However, P waves and SV waves are coupled, because both
affect the same components of displacement and traction. Thus
at interfaces, P waves convert to SV waves, and vice versa,
whereas SH waves do not convert to either P or SV waves.

When treating the earth as a horizontally layered medium,
we assume that P-SV and SH waves propagating between
any two points are decoupled and can be treated separately.
The situation is more complicated when dipping interfaces are
present. P-SV and SH are coupled at a dipping interface if its
normal is not in the plane of propagation, the vertical plane
containing the source and the receiver. Thus, for dipping interfaces,
the waves will be coupled for most pairs of source and
receiver positions.

As a result, in most applications we treat the P-SV system of
propagating waves as distinct from SH. In the last section, we
saw that P waves are described by the scalar potential that
satisfies the scalar wave equation (Eqn 2.4.37), whereas the S
waves are described by the vector potential $\boldsymbol{\Upsilon}$ satisfying the
vector wave equation (Eqn 2.4.38). To see that the SV and
SH potentials each satisfy the vector wave equation separately,
we substitute Eqn 1 into it:

$$\nabla^2[\boldsymbol{\Psi}(x, z, t) + \nabla\times\boldsymbol{\chi}(x, z, t)] = \frac{1}{\beta^2}\frac{\partial^2}{\partial t^2}[\boldsymbol{\Psi}(x, z, t) + \nabla\times\boldsymbol{\chi}(x, z, t)], \tag{8}$$

and regroup the terms:

$$\nabla^2\boldsymbol{\Psi}(x, z, t) - \frac{1}{\beta^2}\frac{\partial^2\boldsymbol{\Psi}(x, z, t)}{\partial t^2} = -\nabla^2[\nabla\times\boldsymbol{\chi}(x, z, t)] + \frac{1}{\beta^2}\frac{\partial^2}{\partial t^2}[\nabla\times\boldsymbol{\chi}(x, z, t)], \tag{9}$$

so the two potentials can be treated separately. Thus the P-SV
system is described by

$$\nabla^2\phi(x, z, t) = \frac{1}{\alpha^2}\frac{\partial^2\phi(x, z, t)}{\partial t^2}, \qquad \nabla^2\psi(x, z, t) = \frac{1}{\beta^2}\frac{\partial^2\psi(x, z, t)}{\partial t^2}. \tag{10}$$

Both of these are scalar wave equations, because $\psi$ is the scalar
function forming the y component of the SV vector potential
(Eqn 3).

For SH waves we have two choices. Interchanging the curl
and the other derivatives in the right side of Eqn 9 shows that
the scalar function $\chi$, the y component of the SH vector potential,
satisfies a scalar wave equation. Alternatively, we can take
the curl and recognize that by Eqns 4 and 5

$$\mathbf{u}(x, z, t) = \nabla\times\nabla\times\boldsymbol{\chi}(x, z, t), \tag{11}$$

so that its only nonzero component, $u_y$, satisfies

$$\nabla^2 u_y(x, z, t) = \frac{1}{\beta^2}\frac{\partial^2 u_y(x, z, t)}{\partial t^2}. \tag{12}$$

Thus the SH-wave displacement satisfies a scalar wave equation
and can be found without using the SH potential.

2.5.3 Angle of incidence and apparent velocity 

We now consider P-SV waves propagating in the x-z plane
that are described by harmonic plane wave solutions of the
scalar wave equations (10),

$$\begin{aligned}
(P)\quad & \phi(x, z, t) = A\exp(i(\omega t - k_x x \pm k_{z_\alpha} z)) \\
(SV)\quad & \psi(x, z, t) = B\exp(i(\omega t - k_x x \pm k_{z_\beta} z)).
\end{aligned} \tag{13}$$

The direction of wave propagation is described by the wave
vector, which is the normal to the wave fronts. For propagation
in the x-z plane, the direction is given by $k_x$ and $k_z$
because $k_y$ is zero. Thus Eqn 13 represents waves propagating
in the +x direction (because of the negative sign in $-k_x x$), and
in both the +z and -z directions.

Subscripts on $\mathbf{k}$ and $k_z$ are needed because the magnitude of
the wave vector differs for P and SV waves. We will see shortly
that in this geometry $k_x$ is the same for the P and the SV waves.
The components of the wave vectors satisfy

$$|\mathbf{k}_\alpha|^2 = k_x^2 + k_{z_\alpha}^2 = \omega^2/\alpha^2, \qquad |\mathbf{k}_\beta|^2 = k_x^2 + k_{z_\beta}^2 = \omega^2/\beta^2. \tag{14}$$

Because $k_y = 0$, $k_x$ is the horizontal component of the wave
vector.

The direction of propagation can also be expressed by the
angle of incidence that the wave vector makes with the vertical
(Fig. 2.5-3). Because the wave vectors, and therefore incidence
angles, differ for P and S waves, we adopt the convention that
$i$ refers to P-wave incidence angles and $j$ to S-wave incidence
angles. Thus

$$\sin i = \frac{k_x}{(k_x^2 + k_{z_\alpha}^2)^{1/2}} = \frac{k_x}{|\mathbf{k}_\alpha|}, \qquad \sin j = \frac{k_x}{(k_x^2 + k_{z_\beta}^2)^{1/2}} = \frac{k_x}{|\mathbf{k}_\beta|}. \tag{15}$$

We will see shortly that plane waves change direction when 
they cross an interface into a material with different seismic 
velocity (Fig. 2.5-4), so the orientation of the wave vector and 
the angle of incidence change. Hence the propagation of a 
plane wave is characterized by the changing orientations of the 
wave vector. We thus speak of a seismic ray that follows this
ray path. Figures like Fig. 2.5-4 are often drawn showing only
the ray paths and omitting the wave fronts that are normal to 
the ray. 

It is useful to define the apparent velocity, $c_x$, the velocity
at which a plane wave appears to travel along a horizontal
surface. Figure 2.5-3 shows that in a time $\Delta t$ a plane wave with
Fig. 2.5-3 The wave vector, k, is normal to the wave front and points in
the direction of propagation. Top: For a plane wave traveling in the
x-z plane, the propagation direction is given by the wave vector $(k_x, k_z)$
or the incidence angle, $i$, between the wave vector and the vertical. In a
time increment $\Delta t$ the wave front moves a distance $v\Delta t$, where $v$ is the
medium velocity, and sweeps out a distance along the surface $c_x\Delta t$, where
$c_x$ is the apparent velocity along the surface. Middle: For a plane wave
traveling vertically, the incidence angle $i = 0°$, $k$ equals $k_z$, and $c_x$ is infinite.
Bottom: For a plane wave propagating horizontally, $i = 90°$, $k$ equals
$k_x$, and $c_x$ equals the medium velocity.

incidence angle $i$ in a medium with velocity $v$ moves forward a
distance $v\Delta t$ and moves across the horizontal surface a distance
$c_x\Delta t$. Thus the horizontal apparent velocity is

$$c_x = v/\sin i. \tag{16}$$

The apparent velocity is always greater than or equal to the
medium velocity, $\alpha$ for P waves and $\beta$ for S waves. A horizontally
propagating wave, with $i = 90°$, has an apparent velocity
equal to the medium velocity. A vertically incident plane wave
arrives everywhere on the surface at the same time, so it has an
infinite apparent velocity.
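
The limiting cases just described can be illustrated with a short Python sketch of the relation $c_x = v/\sin i$. The function name and the 5.5 km/s sample velocity are illustrative; vertical incidence is returned as an infinite apparent velocity.

```python
import math


def apparent_velocity(v, incidence_deg):
    """Horizontal apparent velocity c_x = v / sin(i) for incidence angle i."""
    s = math.sin(math.radians(incidence_deg))
    return math.inf if s == 0.0 else v / s


for angle in (0.0, 30.0, 60.0, 90.0):
    print(angle, apparent_velocity(5.5, angle))
```

The apparent velocity decreases from infinity at vertical incidence toward the medium velocity at horizontal propagation, and never falls below the medium velocity.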

The horizontal apparent velocity$^1$ can be written in terms of
the horizontal component of the wave vector using Eqns 15
and 16:

$^1$ Because seismological observations are made at the earth’s surface, the apparent
velocity along the earth’s surface is sometimes written as $c$ rather than $c_x$, and $k$ is
sometimes used to denote $k_x$.
the orientation of the wave vector k, or by a ray path showing successive orientations of the wave vector. The wave fronts, which are often not shown,
are normal to the ray path.

Fig. 2.5-5 Snell’s law for plane waves propagating into a higher-velocity medium. Left: An incoming P wave generates transmitted and reflected P and
SV waves. The reflected P wave has the same incidence angle, $i_1$, as the incoming P wave. Because in each medium the P-wave velocity exceeds the S-wave
velocity, $j_1 < i_1$ and $j_2 < i_2$. Right: The same situation for an incoming SV wave. The incidence angles of the incoming and reflected SV waves, $j_1$, are equal.
The relationships between the other incidence angles are the same as for an incident P wave.


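$$c_x = \omega/k_x. \tag{17}$$

Thus we define the ratios of vertical to horizontal wavenumbers
as

$$\begin{aligned}
r_\alpha &= k_{z_\alpha}/k_x = (c_x^2/\alpha^2 - 1)^{1/2} = \cot i, \\
r_\beta &= k_{z_\beta}/k_x = (c_x^2/\beta^2 - 1)^{1/2} = \cot j,
\end{aligned} \tag{18}$$

so that the potentials (Eqn 13) can be written

$$\begin{aligned}
(P)\quad & \phi(x, z, t) = A\exp(i(\omega t - k_x x \pm k_x r_\alpha z)) \\
(SV)\quad & \psi(x, z, t) = B\exp(i(\omega t - k_x x \pm k_x r_\beta z)).
\end{aligned} \tag{19}$$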
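
The consistency of these relations can be checked numerically. The Python sketch below builds the wavenumber components from an incidence angle, forms the apparent velocity $c_x = \omega/k_x$, and confirms that the ratio of vertical to horizontal wavenumber equals both $(c_x^2/v^2 - 1)^{1/2}$ and the cotangent of the incidence angle. The function names and sample values are illustrative.

```python
import math


def wavenumber_components(omega, v, incidence_deg):
    """Return (k_x, k_z) for a plane wave with |k| = omega / v."""
    i = math.radians(incidence_deg)
    k = omega / v
    return k * math.sin(i), k * math.cos(i)


def vertical_ratio(c_x, v):
    """Ratio k_z / k_x written in terms of the apparent velocity c_x."""
    return math.sqrt(c_x ** 2 / v ** 2 - 1.0)


omega, v, angle = 2.0 * math.pi, 5.5, 30.0   # hypothetical 1 Hz P wave
k_x, k_z = wavenumber_components(omega, v, angle)
c_x = omega / k_x
print(k_z / k_x, vertical_ratio(c_x, v), 1.0 / math.tan(math.radians(angle)))
```

The three printed numbers should agree to within rounding error.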

2.5.4 Snell's law 

We now consider the relation between the angles of incidence
for transmitted and reflected harmonic plane P-SV waves
at an interface. In the geometry of Fig. 2.5-5, an interface at
$z = 0$ separates medium 1 with P and S velocities $\alpha_1$ and $\beta_1$ from
medium 2 that has velocities $\alpha_2$ and $\beta_2$. We first assume that
$\alpha_1 < \alpha_2$ and $\beta_1 < \beta_2$.

A P wave incident from medium 1 generates reflected and
transmitted P waves. In addition, part of the P wave is converted
into a reflected SV wave and a transmitted SV wave.
Each of these waves can be described by an appropriate potential.
In medium 1 we have upgoing and downgoing P waves
and an upgoing SV wave, so the potentials are

$$\begin{aligned}
\phi(x, z, t) &= \text{incident } P + \text{reflected } P \\
&= A_1\exp(i(\omega t - k_x x - k_x r_{\alpha_1} z)) + A_2\exp(i(\omega t - k_x x + k_x r_{\alpha_1} z)) \\
\psi(x, z, t) &= \text{reflected } SV = B_2\exp(i(\omega t - k_x x + k_x r_{\beta_1} z)).
\end{aligned} \tag{20}$$

The form of each potential describes the wave. Terms like $k_x r_{\alpha_1}$,
the z component of the wavenumbers, indicate which medium
(1 or 2) and what wave type (P or S) this potential describes.








The direction of propagation for each wave is given by the
components of the wave vector k. For example, the signs of
the $k_x x$ and $k_x r_{\alpha_1} z$ terms show that the incoming P wave with
amplitude $A_1$ travels in the +x and +z directions as time
increases. Similarly, the reflected P wave with amplitude $A_2$ and
the reflected SV wave with amplitude $B_2$ travel in the +x and
-z directions.

The downgoing P and SV waves in the second
medium are given by the potentials

$$\begin{aligned}
\phi(x, z, t) &= \text{transmitted } P = A'\exp(i(\omega t - k_x x - k_x r_{\alpha_2} z)) \\
\psi(x, z, t) &= \text{transmitted } SV = B'\exp(i(\omega t - k_x x - k_x r_{\beta_2} z)).
\end{aligned} \tag{21}$$

$A'$ and $B'$ are the amplitudes of the transmitted P and SV waves,
which travel in the +x and +z directions. We generally write
the amplitudes of P waves as $A$ and the amplitudes of S waves
as $B$.

We can find the incidence angles of the transmitted and
reflected waves from the incidence angle of the incoming wave.
The boundary conditions for the solid-solid interface at $z = 0$
are that the components of the displacement and traction
vectors are continuous (Section 2.3.10). Because all of the
potentials contain the phase factor $\exp(i(\omega t - k_x x))$ times a
factor independent of x and t, all of the displacement and traction
components have this phase factor. For the displacement
and traction to be continuous at the interface for all x and all t,
$(\omega t - k_x x)$ must be equal for each of the potentials. Thus the
horizontal wavenumber $k_x$, and hence the apparent velocity
along the interface $c_x = \omega/k_x$, must be the same for each wave.
As a result, the waves travel along the interface at the same
speed and stay in phase.

This condition and the definition of $c_x$ (Eqn 16) give the
familiar form of Snell’s law:

$$\frac{\sin i_1}{\alpha_1} = \frac{\sin j_1}{\beta_1} = \frac{\sin i_2}{\alpha_2} = \frac{\sin j_2}{\beta_2}; \tag{22}$$

the ratio of the sine of the angle of incidence for each wave to
the corresponding velocity is constant. Hence the incident and
reflected P waves have the same incidence angle $i_1$. The transmitted
P and S waves change direction by an amount depending on
the velocities in the two media. A change in direction upon
transmission into a medium with a different velocity is called
refraction, so the waves in the second medium are called
refracted or transmitted waves. Figure 2.5-5 illustrates the ray
paths for the different waves.

The S wave reflected from the boundary satisfies

$$\sin j_1 = \sin i_1 (\beta_1/\alpha_1). \qquad (23)$$

Because in any medium P waves travel faster than S waves,
Snell’s law requires that $j_1 < i_1$. Hence the reflected S ray is
closer to the vertical, or further from the interface, than the P
ray in the same medium. Physically, this is because the S wave
must be closer to the vertical than the P wave to have the same
apparent velocity along the interface.

The angle of incidence for the refracted P wave is related to
that for the incident P wave by

$$\sin i_2 = \sin i_1 (\alpha_2/\alpha_1). \qquad (24)$$

If the second medium has a higher velocity, then $i_2 > i_1$, so the
transmitted ray is further from the vertical than the incident
ray. It travels more horizontally, so the apparent velocities
along the interface are equal. On the other hand, if $\alpha_1 > \alpha_2$,
then the refracted P wave would be closer to normal incidence.
(This effect, for light waves, makes a pencil appear to bend at
the surface of a glass of water.)

The transmitted S wave satisfies

$$\sin j_2 = \sin i_1 (\beta_2/\alpha_1). \qquad (25)$$

Hence for $\beta_2 > \beta_1$ we get $j_2 > j_1$, so the transmitted S wave
is more nearly horizontal than the reflected S wave. Similar
relations apply for an incident SV wave (Fig. 2.5-5). The
reflected P ray is bent further from the normal than the incident
or reflected SV rays.

The fact that an incident P wave generates both P and SV 
waves, and vice versa, is a consequence of the displacement 
and traction boundary conditions at the interface, as we will 
see in Section 2.6. Some insight into why this should be can be 
obtained by considering Fig. 2.5-6, in which an incident SV 
wave disturbs the boundary, which then generates P waves in 
addition to the transmitted and reflected SV waves. 
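
The angle relations of Eqns 22 to 25 can be checked numerically. The following is a minimal sketch rather than anything from the text: the function name `snell_angles` and the velocities (in km/s) are illustrative assumptions, and all angles are measured from the vertical.

```python
import math

def snell_angles(i1_deg, alpha1, beta1, alpha2, beta2):
    """Angles (degrees from vertical) of the waves produced by an incident P wave.

    An entry is None when Snell's law would require sin(angle) > 1,
    that is, when the incidence is postcritical for that wave.
    """
    # sin(i)/v is the same for every wave generated at the interface (Eqn 22).
    ratio = math.sin(math.radians(i1_deg)) / alpha1

    def angle(v):
        s = ratio * v
        return math.degrees(math.asin(s)) if s <= 1.0 else None

    return {
        "reflected_P": angle(alpha1),
        "reflected_SV": angle(beta1),
        "transmitted_P": angle(alpha2),
        "transmitted_SV": angle(beta2),
    }

# Hypothetical crust-over-mantle velocities, chosen only for illustration.
angles = snell_angles(30.0, alpha1=6.0, beta1=3.5, alpha2=8.0, beta2=4.6)
for name, value in angles.items():
    print(name, value)
```

For these assumed values the reflected SV ray is closer to the vertical than the incident P ray, and the transmitted P ray is further from it, as described above.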

2.5.5 Critical angle 

When a P wave impinges on a horizontal boundary, Eqn 24
shows that the incidence angle for the transmitted P wave in the
second medium is

$$i_2 = \sin^{-1}[\sin i_1 (\alpha_2/\alpha_1)], \qquad (26)$$

where the notation $\sin^{-1}$ indicates the inverse sine function.
If the second medium has a higher velocity, the transmitted
P ray is further from the vertical than the incident ray. As the
angle of incidence increases, the transmitted ray approaches
the horizontal interface (Fig. 2.5-7). Eventually, the
incidence angle $i_1$ reaches a value $i_c$ where $i_2 = 90°$ and the
argument of the $\sin^{-1}$ term becomes 1, so

$$\sin i_c (\alpha_2/\alpha_1) = 1 \quad \text{or} \quad \sin i_c = \alpha_1/\alpha_2. \qquad (27)$$

Thus for a wave incident at this critical angle of incidence, the
transmitted wave grazes the interface.

Once the incidence angle exceeds the critical angle, which is a
situation called postcritical incidence, no transmitted plane
wave exists in the second medium. This phenomenon is sometimes
called total internal reflection. In this case, as we will see
in the next section, the P-wave potential for the second medium
has a $z$-dependent real exponential term, $\exp(-k_z z)$, instead of
a purely imaginary exponential term, $\exp(-ik_z z)$. Hence the
displacement in the second medium is not a propagating plane
wave, but occurs as an evanescent wave that travels along the
interface and decays away from the interface.

Fig. 2.5-6 Cartoon demonstrating how an SV wave (shown by the
light grey wave front) incident at a boundary generates reflected and
transmitted P (dark grey wave front) and SV waves, for the case shown
in the bottom half of Fig. 2.5-5. a: The incident SV wave disturbs the
boundary. b: The displaced boundary generates reflected and transmitted
P and SV waves. c: As the incident SV wave advances, its intersection with
the boundary moves, continuously generating reflected and transmitted
waves.

Fig. 2.5-7 Illustration of the critical angle $i_c$ for P waves incident on a
faster medium. The transmitted S and the reflected P and S waves are not
shown. As the angle of incidence increases, the incoming waves become
more nearly horizontal, and the refracted P waves approach the interface.
For waves incident at an angle exceeding (more horizontal than) the
critical angle, no traveling P wave is transmitted into medium 2.

Although for angles of incidence beyond the critical angle
there is no transmitted P wave, there can still be a transmitted
S wave. If the S velocity in medium 2 is greater than the P
velocity in medium 1 there is a second critical angle

$$\sin i_c' = \alpha_1/\beta_2 \qquad (28)$$

beyond which no transmitted P or S waves occur.
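
As a quick illustration of Eqns 27 and 28, the sketch below evaluates both critical angles for an incident P wave. The helper name `critical_angle` and the velocities are assumptions made for the example; the velocities are chosen so that the S velocity in medium 2 exceeds the P velocity in medium 1, which is the condition for the second critical angle to exist.

```python
import math

def critical_angle(v_incident, v_transmitted):
    """Critical angle in degrees, or None if the second velocity is not higher."""
    if v_transmitted <= v_incident:
        return None  # the transmitted ray bends toward the vertical; no critical angle
    return math.degrees(math.asin(v_incident / v_transmitted))

# Illustrative velocities in km/s (hypothetical values).
alpha1, alpha2, beta2 = 4.0, 8.0, 4.6

ic_P = critical_angle(alpha1, alpha2)  # Eqn 27: transmitted P grazes the interface
ic_S = critical_angle(alpha1, beta2)   # Eqn 28: transmitted S grazes the interface
print(ic_P, ic_S)
```

Because the transmitted S wave is slower than the transmitted P wave, the second critical angle is always the larger of the two.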

2.5.6 Snell's law for SH waves 

Snell’s law also applies to SH waves. Because for SH waves the 
displacement satisfies the wave equation, SH waves in the first 
medium are described by 

$$u_y(x, z, t) = B_1 \exp(i(\omega t - k_x x - k_x r_{\beta_1} z)) + B_2 \exp(i(\omega t - k_x x + k_x r_{\beta_1} z)), \qquad (29)$$

where $B_1$ and $B_2$ are the amplitudes of the incoming and
reflected SH waves (Fig. 2.5-8). In the second medium, the
transmitted SH wave is

$$u_y(x, z, t) = B' \exp(i(\omega t - k_x x - k_x r_{\beta_2} z)). \qquad (30)$$

As before, Snell’s law

$$c_x = \beta_1/\sin j_1 = \beta_2/\sin j_2 \qquad (31)$$

applies, because $(\omega t - k_x x)$ must be equal for all three waves for
the traction and displacement to be continuous at the interface.
The critical angle for SH waves is thus

$$\sin j_c = \beta_1/\beta_2. \qquad (32)$$

Fig. 2.5-8 An SH wave propagating in the x-z plane creates only
transmitted and reflected SH waves when incident on a solid-solid
interface in the x-y plane. The incident and reflected waves have the
same incidence angle, $j_1$. For $\beta_2 > \beta_1$,

Fig. 2.5-9 Geometric interpretation of P-wave propagation in terms of the
relation between the angle of incidence, $i$, the wave vector, $\mathbf{k}_\alpha$, the
slowness vector, $\mathbf{s}$, the ray parameter or horizontal slowness, $p$, and the
vertical slowness, $\eta_\alpha$.

2.5.7 Ray parameter and slowness

A useful way to characterize a wave’s ray path is via its ray
parameter, $p$, the reciprocal of the horizontal apparent velocity,

$$p = 1/c_x = \sin i/v = k_x/\omega, \qquad (33)$$

where $i$ is the incidence angle of either a P or an S wave, and $v$
is the corresponding velocity. The harmonic plane wave solution
can be written in terms of the ray parameter. To illustrate
this, consider the potential for a P wave propagating in the
x-z plane, and factor out the angular frequency:

$$\exp(i(\omega t - k_x x - k_x r_\alpha z)) = \exp(i\omega(t - (k_x/\omega)x - (k_x/\omega) r_\alpha z))$$

$$= \exp(i\omega(t - px - \eta_\alpha z))$$

$$= \exp(i\omega(t - \mathbf{s} \cdot \mathbf{x})). \qquad (34)$$

Here we define the slowness vector,

$$\mathbf{s} = (p, \eta_\alpha), \qquad (35)$$

whose components are the ray parameter $p$ and
$\eta_\alpha = (k_x/\omega) r_\alpha = p r_\alpha = r_\alpha/c_x = (1/\alpha^2 - p^2)^{1/2}$.

We can interpret $\eta_\alpha$ geometrically using the components of
the wave vector, because by Eqn 18 $r_\alpha = k_{z_\alpha}/k_x$, so

$$\eta_\alpha = k_{z_\alpha}/\omega = k_{z_\alpha}/(|\mathbf{k}_\alpha| \alpha) = \cos i/\alpha. \qquad (36)$$

$\eta_\alpha$ and the ray parameter $p$ are closely related because both are
functions of the angle of incidence divided by the velocity.
Hence the magnitude of the slowness vector is

$$|\mathbf{s}| = (p^2 + \eta_\alpha^2)^{1/2} = (\sin^2 i/\alpha^2 + \cos^2 i/\alpha^2)^{1/2} = 1/\alpha. \qquad (37)$$

Thus the reciprocal of the velocity, $1/\alpha$, is called the scalar
slowness, an apt term because a low-velocity medium is very
slow (has a high slowness), whereas a high-velocity medium has
low slowness. The slowness vector (Fig. 2.5-9) is directed along
the ray (parallel to the wave vector) with a magnitude equal
to the slowness, and can be written $\mathbf{s} = \hat{\mathbf{k}}_\alpha/\alpha$, where
$\hat{\mathbf{k}}_\alpha$ is the unit vector along the wave vector. Its components
are the ray parameter $p$, also called horizontal slowness,
and $\eta_\alpha$, called the vertical slowness. Similarly, for S waves the
slowness is

$$\mathbf{s} = (p, \eta_\beta) = \hat{\mathbf{k}}_\beta/\beta,$$

$$\eta_\beta = (1/\beta^2 - p^2)^{1/2} = \cos j/\beta = p r_\beta = r_\beta/c_x. \qquad (38)$$
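
The relations among $p$, $\eta_\alpha$, and the scalar slowness (Eqns 33 and 35 to 37) are easy to verify numerically. This is a small sketch under assumed values; the function name `slowness_vector`, the velocity, and the incidence angle are hypothetical.

```python
import math

def slowness_vector(i_deg, v):
    """Return (p, eta): horizontal and vertical slowness for incidence angle i."""
    i = math.radians(i_deg)
    p = math.sin(i) / v    # ray parameter, Eqn 33
    eta = math.cos(i) / v  # vertical slowness, Eqn 36
    return p, eta

alpha = 6.0  # km/s, illustrative P velocity
p, eta = slowness_vector(35.0, alpha)

magnitude = math.hypot(p, eta)                 # should equal 1/alpha (Eqn 37)
eta_from_p = math.sqrt(1.0 / alpha**2 - p**2)  # alternative form of eta
print(p, eta, magnitude, 1.0 / alpha)
```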

Writing a harmonic plane wave in terms of slowness gives
several insights. In the argument of the exponential in Eqn 34,
$(i\omega(t - \mathbf{s} \cdot \mathbf{x}))$, the slowness term, $\mathbf{s} \cdot \mathbf{x}$, has the dimension
of time, and shows the net travel time due to the vertical and
horizontal propagation times, each of which is described by
the corresponding component of the slowness. The slowness
formulation also gives another view of Snell’s law. We derived
Snell’s law by considering a harmonic plane wave incident
on a horizontal interface and the resulting reflected and
transmitted plane waves. The horizontal component of the wave
vectors, $k_x$, and hence the horizontal apparent velocity, were
continuous at the interface. By contrast, the terms related to
the vertical component of the wave vectors, like $k_{z_\alpha} = k_x r_\alpha$, varied
between layers and for P and S waves. The corresponding
formulation in terms of slowness says that the ray parameter or
horizontal slowness $p$ is the same for the incident, reflected, and
transmitted waves, whereas the vertical slowness depends on the
medium and the wave type. Snell’s law can thus be stated as: $p$
is constant for a ray and any rays that it produces at interfaces.

An important application of the ray parameter is in describing
the evolution of a ray that encounters a number of interfaces
(Fig. 2.5-10). Each of the four rays generated at the first
interface in turn generates another four rays at the next
interface, and so on. Because Snell’s law applies at each interface, all
these rays have the same ray parameter. As a result, $p$ is
constant along any ray path, no matter how many transmissions,
reflections, or conversions the ray has undergone. This
gives a way of tracing the ray path for a ray that began its
travels with a certain ray parameter. In doing this on a computer,
an advantage of the ray parameter is that it is zero for a
vertically incident wave, whereas $c_x$ is infinite.

Fig. 2.5-10 A P wave incident on a stack of flat layers generates four
waves, two reflected and two transmitted, at each interface. Each of these
waves generates four more at each interface, and so on. All these waves
have the same ray parameter, so their paths can be traced by applying
Snell’s law at each interface.
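
Because $p$ is constant along a ray, tracing a ray down through flat layers reduces to a loop. The sketch below is one possible implementation, not a method given in the text: the function name `trace_down`, the layer model, and the decision to stop where $p v \ge 1$ are all assumptions. It follows only the downgoing transmitted ray of a single wave type and accumulates horizontal offset and travel time.

```python
import math

def trace_down(p, layers):
    """Trace a downgoing ray with ray parameter p (s/km) through flat layers.

    layers: list of (thickness_km, velocity_km_s), top layer first.
    Returns (offset_km, time_s, layers_crossed). Tracing stops at the first
    layer where p * v >= 1, where no transmitted ray exists.
    """
    x = 0.0
    t = 0.0
    crossed = 0
    for h, v in layers:
        sin_i = p * v  # Snell's law: sin(i) = p * v in every layer
        if sin_i >= 1.0:
            break
        cos_i = math.sqrt(1.0 - sin_i**2)
        x += h * sin_i / cos_i  # horizontal distance covered in this layer
        t += h / (v * cos_i)    # path length divided by velocity
        crossed += 1
    return x, t, crossed

# Hypothetical three-layer model (km, km/s).
layers = [(10.0, 5.0), (20.0, 6.5), (30.0, 8.0)]
print(trace_down(0.0, layers))   # vertical ray: p = 0 causes no difficulty
print(trace_down(0.10, layers))  # crosses all three layers
print(trace_down(0.14, layers))  # cannot enter the 8.0 km/s layer
```

The vertical ray shows why $p$ is convenient: setting $p = 0$ needs no special handling, whereas the corresponding $c_x$ would be infinite.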

2.5.8 Waveguides 

Snell’s law is one of seismology’s most important tools, 
because seismic waves encounter variations in velocity due to 
changes in the physical properties of the materials, including 
the effects of composition, temperature, and pressure. In
general, the velocity increases with depth, so seismic waves turn
toward the horizontal as they go deeper. Eventually the ray 
“bottoms,” turns upward, and reaches the surface (Fig. 1.1-3). 
Such ray paths can be modeled using Snell’s law, either with 
many layers or with a version (Section 3.4) accommodating 
velocities that vary smoothly with depth and so give smooth 
ray paths. The ray path and the travel time along it thus provide 
information about the distribution of seismic velocities and 
physical properties with depth. 

However, in some regions velocity decreases with depth,
yielding a low-velocity medium between higher-velocity media
(Fig. 2.5-11, top). If seismic waves are generated in the
low-velocity medium, then total internal reflection will trap much
of the seismic energy in the low-velocity channel, which acts
as a waveguide.² One such waveguide occurs in the oceans,
because the speed of sound in seawater increases with both
temperature and pressure. The combination of temperature
decreasing with depth and pressure increasing with depth
produces a low-velocity region known as the SOFAR (SOund
Fixing And Ranging) channel at a depth of ~1000 meters. Rays
leaving a source in the channel at angles up to ±12° from the
horizontal are internally reflected (Fig. 2.5-11, bottom). The
ray paths are curved because of the smooth velocity structure.
The SOFAR channel transmits sound very efficiently, allowing
explosions, submarines, and whales to be detected at great
distances. As a result, the speed of sound waves in the channel
is being used to search for changes in ocean temperature that
may be due to global warming. Similarly, earthquakes can be
studied using seismic waves in the SOFAR channel that cause
arrivals called T waves (Fig. 2.5-12, top), which can be detected
by hydrophones in the water, or by seismometers when a T
wave hits land. The ringing quality of T waves (Fig. 2.5-12,
bottom) is due to the internal reflections within the SOFAR
channel. Waveguides are also associated with fault zones due
to their low velocities relative to the surrounding rocks.

² Similarly, fiber optic cables transmit light signals by trapping them in a
low-velocity material surrounded by high-velocity materials.

Fig. 2.5-11 Top: A low-velocity layer surrounded by high-velocity
material acts as a waveguide. Rays incident on either interface at angles
exceeding the critical angle undergo total internal reflection. Bottom:
The SOFAR channel, a low-velocity zone (right) in the ocean, acts as a
waveguide, as shown by ray paths from a source in the channel (left).
Note the non-SI units for distance and velocity. (Ewing et al., 1957)

2.5.9 Fermat’s principle and geometric ray theory

As our discussions so far show, we can gain insight into the
behavior of seismic waves by considering the ray paths associated
with them. This approach, studying wave propagation using
ray paths, is called geometric ray theory. Although it does not
fully describe important aspects of wave propagation, it is
widely used because it often greatly simplifies the analysis and
gives the correct answer or a good approximation.

Fig. 2.5-12 Top: A P wave generated by an earthquake and reflected
between the ocean floor and surface is trapped in the SOFAR
channel and propagates as a T wave. Bottom: T waves recorded in
Tahiti from an earthquake in Tonga. The amplitudes exceeded the gain
on the seismometers, causing them to clip at the top and bottom.
The high-frequency ringing of the T waves distinguishes them from
body and surface waves. (Talandier and Okal, 1979.
© Seismological Society of America.)

The most obvious application of rays is for computing travel 
times. To find when a plane wave generated at one position will 
arrive at another, we use the travel time, which is the length 
of the ray path divided by the velocity. Thus, if waves follow 
complicated paths, their travel time is the sum of the travel 
times for each portion of the ray path. The travel time for a ray 
that has traveled through several media, sometimes as a P wave 
and sometimes as an S wave, is found using the appropriate 
path length and velocity for each segment. 

The concept underlying this approach is Fermat’s principle,
a famous result from optics, the study of light. Fermat’s
principle states that the ray paths between two points are those
for which the travel time is an extremum, a minimum or
maximum, with respect to the nearby possible paths. The simplest
case is two points in a homogeneous halfspace; the time needed
to traverse the straight line connecting the points is less than
for adjacent paths (Fig. 2.5-13). A second ray path for which
the time is a minimum compared to adjacent paths is that of the
reflected ray satisfying Snell’s law. The direct ray path
corresponds to an absolute minimum of the travel time, whereas
the reflected ray corresponds to a local minimum.

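Snell’s law can be derived from Fermat’s principle. Consider
the possible ray paths (Fig. 2.5-14) between the point $(0, a)$ in
medium 1, with velocity $v_1$, and the point $(b, -c)$ in medium 2,
with velocity $v_2$. The ray paths can be parametrized by the
point $(x, 0)$ where they cross the interface. The travel time as
a function of $x$ is

$$T(x) = \frac{(a^2 + x^2)^{1/2}}{v_1} + \frac{((b - x)^2 + c^2)^{1/2}}{v_2}.$$

Setting the derivative to zero to find the extremum,

$$\frac{dT}{dx} = \frac{x}{v_1 (a^2 + x^2)^{1/2}} - \frac{b - x}{v_2 ((b - x)^2 + c^2)^{1/2}} = 0,$$

and noting that the two ratios are the sines of the angles $i_1$ and
$i_2$ that the ray makes with the vertical in each medium, gives
Snell’s law, $\sin i_1/v_1 = \sin i_2/v_2$.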





Fig. 2.5-13 Two ray paths (solid lines), one for the direct ray and 
one for the reflection obeying Snell’s law, connecting two points in a 
homogeneous halfspace. The travel time for these paths is less than 
for nearby paths (dashed), in accord with Fermat’s principle. 



Fig. 2.5-14 Derivation of Snell’s law for refraction using Fermat’s 
principle. The ray path between points on opposite sides of the interface 
is that for which the travel time is a minimum. 
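
Fermat’s principle for the refraction geometry of Fig. 2.5-14 can also be explored numerically. For a ray from $(0, a)$ to $(b, -c)$ crossing the interface at $(x, 0)$, the travel time is the sum of the two straight path lengths divided by their velocities. The sketch below, which uses an assumed golden-section search and hypothetical values for $a$, $b$, $c$, $v_1$, and $v_2$, finds the crossing point that minimizes the travel time and then compares $\sin i/v$ on the two sides.

```python
import math

def travel_time(x, a, b, c, v1, v2):
    """Travel time from (0, a) to (b, -c) via the interface point (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(b - x, c) / v2

def fermat_crossing(a, b, c, v1, v2, tol=1e-9):
    """Golden-section search on [0, b] for the crossing point of least time."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, b
    while hi - lo > tol:
        x1 = hi - g * (hi - lo)
        x2 = lo + g * (hi - lo)
        if travel_time(x1, a, b, c, v1, v2) < travel_time(x2, a, b, c, v1, v2):
            hi = x2  # the minimum lies in [lo, x2]
        else:
            lo = x1  # the minimum lies in [x1, hi]
    return 0.5 * (lo + hi)

# Illustrative geometry (km) and velocities (km/s); all values are assumptions.
a, b, c, v1, v2 = 10.0, 30.0, 15.0, 5.0, 8.0
x = fermat_crossing(a, b, c, v1, v2)

sin_i1 = x / math.hypot(x, a)            # sine of the angle from vertical, medium 1
sin_i2 = (b - x) / math.hypot(b - x, c)  # sine of the angle from vertical, medium 2
print(x, sin_i1 / v1, sin_i2 / v2)
```

The search never uses Snell’s law, yet the two printed ratios agree to within the search tolerance, which is the content of the derivation.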
In most seismological applications the ray paths and travel
times derived using Snell’s law yield results in reasonable
accord with observations, because most seismic energy
propagates as though it followed ray paths. However, geometric ray
theory is only an approximation to the solutions of the elastic
equation of motion that describes the generation and
propagation of seismic energy. As a result, ray theory has two major
limitations. First, it does not directly provide information
about wave amplitudes. Hence, although deriving Snell’s law
using ray theory gives the angles of the reflected and
transmitted waves, we need wave theory to find their amplitudes. In
some cases, this limitation can be circumvented by tracing rays
from a source and using the resulting density of rays to infer
amplitudes (Sections 2.8.4, 3.4.2, 3.7.3). Second, in other
applications, as discussed next, geometric rays fail to describe
the wave’s behavior.


2.5.10 Huygens' principle and diffraction 

In some applications treating propagating waves as geometric
rays fails to explain what we observe. For example, waves
bend or diffract around the earth’s core and so reach places
to which Snell’s law predicts no ray path. Similarly, although
ray theory says that no energy is transmitted when a wave is
incident on an interface at an angle greater than the critical
angle, some energy is in fact transmitted. Addressing such
issues requires explicitly considering the fact that seismic
energy propagates as waves. To do this, we draw on results
from both seismology and other wave phenomena, especially
light waves, which are easier to study and have been
investigated for many years.

One important approach, known as Huygens’ principle, is
illustrated in Fig. 2.5-15. Each point on a wave front is
considered to be a Huygens’ source that gives rise to another circular
wave front. These wave fronts interfere constructively to give
a circular wave front, and interfere destructively everywhere
else. In three dimensions, the wave fronts are spherical.



Fig. 2.5-15 Figure adapted from Huygens’ original (1690) analysis
showing how circular wave fronts can be generated by treating each point
on the initial wave front as a point source of wave energy. (Reprinted from
Huygens, Treatise on Light, trans. S. P. Thompson (Dover, New York).)



Fig. 2.5-16 Demonstration of Huygens’ principle for the propagation 
of a straight wave front. Successive wave fronts are generated by drawing 
a circular wave from each point on the previous wave front and then 
drawing a line tangent to the circles. The circular wave fronts are assumed 
to interfere destructively everywhere else. 


Although the point sources, known also as diffractors or 
scatterers, need not have a physical interpretation, in some 
cases they do. For example, heterogeneities in the crust and 
mantle scatter incident seismic waves. Hence, migration 
methods in exploration seismology (Section 3.3.7) improve 
images of the subsurface by undoing this scattering. Similarly, 
seismic energy that arrives before PKP waves that traverse the 
earth’s core is thought to have been scattered by heterogeneities 
in the mantle. 

Huygens’ principle gives another way of thinking about
phenomena we have discussed. It explains why a straight wave
front generates subsequent straight wave fronts, as shown in
Fig. 2.5-16. It is also another way of deriving Snell’s law.
Assume, as in Fig. 2.5-17, that a wave front A-A' in medium 1
is incident upon a boundary with medium 2. When the wave
front reaches point A, energy begins to radiate outward, but if
the velocity in the second medium is less, the radius of the
circular wave front some time later is smaller in medium 2.
Similarly, as the wave front reaches other points along the interface
(for example, point B), circular wave fronts of different sizes
spread out in the two media. By the time the initial wave front
reaches point C, one planar wave front, drawn as the tangent
to the circular wave fronts in medium 1, is the reflected wave,
and another gives the refracted wave. The directions of the
waves, taken as the perpendiculars to the planar wave fronts,
are those expected from Snell’s law. Thus we have three ways
of understanding how Snell’s law comes about: Huygens’
principle, Fermat’s principle (Section 2.5.9), and the application of
the interface boundary conditions to plane waves (Section
2.5.4). Each approach offers different insight into the
phenomenon of reflection and refraction.

Huygens’ principle also explains the phenomenon of
diffraction, in which waves bend around obstacles. Although the
phenomenon is complicated, the simple example of diffraction
at a slit (Fig. 2.5-18, top) gives considerable insight. We assume
that an incident planar wave front acts like a set of Huygens’
sources, so the transmitted wave field is the superposition of
waves from these sources. In front of the slit, the sources
combine to give a planar transmitted wave front. In addition,
energy propagates to the sides, and thus can be detected around
the corners, although there is no geometric ray path to there.
The analogous process occurs with shear waves that cannot
pass through the liquid outer core, and so diffract around it
(Section 3.5.2).

Although evaluating the amplitude of the diffracted waves
requires going beyond Huygens’ principle, a simple construction
(Fig. 2.5-18, middle) shows some important aspects. If the
slit has width $d$, then waves from either side of the slit will be
out of phase by 180° and so interfere³ destructively at distance
$D$ when the path difference is a half wavelength. Hence the
amplitude will be zero at a distance $x_0$, or an angle $\theta$, from the
middle of the slit. By this condition

$$\lambda/2 = d \sin\theta \approx d x_0/D, \qquad (42)$$

assuming $D \gg d$. Thus the amplitude decays from its maximum
at $\theta = 0$ to zero at $x_0 = \lambda D/2d$. A more sophisticated
analysis⁴ shows that the amplitude varies as

$$(\sin \xi)/\xi, \quad \text{where } \xi = 2\pi d x/\lambda D, \qquad (43)$$

which is shown in Fig. 2.5-18 (bottom). This function has
a central lobe of width $2x_0$ and a series of decreasing side lobes.
The slit illustrates general properties of diffraction, because
diffraction around an obstacle is in many ways similar. An
important point is that diffraction depends on the wavelength,
so longer wavelengths have broader lobes and thus are more
affected by diffraction. For example, we can hear around open
doorways but not see around them, because sound has a
wavelength of about 0.1 m, compared to $10^{-7}$ m for visible light.
Similarly, seismic waves that diffract around the core lose their
high-frequency components. Hence the longer the wavelength,
the poorer an approximation geometric ray theory becomes.

³ Interference and diffraction are terms for closely related wave phenomena between
which there is no sharp distinction. Effects involving a few sources are typically called
interference, whereas those involving many sources are often called diffraction.

⁴ This analysis uses Fourier transforms, and so yields the $(\sin \xi)/\xi$ function that
commonly appears in Fourier analysis, as we will see in Section 6.3.

Fig. 2.5-17 Derivation of Snell’s law using Huygens’ principle. As an
incident plane wave A-A' interacts with the boundary, the Huygens’
sources combine to form a reflected wave front C-C' and a transmitted
wave front C-D. Because the radii of the circular wave fronts are
proportional to the velocity in each medium, the angles of the incident
(O-A), reflected (A-C'), and transmitted (A-D) rays yield Snell’s law.

Fig. 2.5-18 Top: Use of Huygens’ sources to describe waves diffracting at
a slit. Energy diffracts around the corners to reach areas with no geometric
ray paths leading to them. (Klein and Furtak, 1986. Copyright © 1986.
Reprinted by permission of John Wiley & Sons, Inc.) Middle: Geometry
for the analysis of diffraction by a slit of width $d$, observed at a distance $D$.
Bottom: The $(\sin \xi)/\xi$ function describing the amplitude of the diffracted
wave, showing the central lobe and side lobes.
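
The wavelength dependence of the slit pattern can be made concrete with a few lines of code. This sketch simply evaluates the $(\sin \xi)/\xi$ amplitude with $\xi = 2\pi d x/\lambda D$ and the first zero $x_0 = \lambda D/2d$ from the simplified slit analysis; the function names and the slit geometry are assumptions chosen for illustration.

```python
import math

def slit_amplitude(x, wavelength, d, D):
    """Relative amplitude (sin xi)/xi at offset x behind a slit of width d."""
    xi = 2.0 * math.pi * d * x / (wavelength * D)
    return 1.0 if xi == 0.0 else math.sin(xi) / xi

def first_zero(wavelength, d, D):
    """Offset x0 of the first zero, so the central lobe has width 2 * x0."""
    return wavelength * D / (2.0 * d)

# A 1 m wide opening observed 10 m away (hypothetical numbers).
d, D = 1.0, 10.0
x0_sound = first_zero(0.1, d, D)   # sound, wavelength about 0.1 m
x0_light = first_zero(1e-7, d, D)  # visible light, wavelength about 1e-7 m
print(x0_sound, x0_light)
print(slit_amplitude(x0_sound, 0.1, d, D))  # essentially zero at x0
```

With these numbers the central lobe for sound is about a million times wider than for light, which is the doorway example in quantitative form.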

Specifically, the diffraction depends on the ratio of the
wavelength to the slit width. If the slit is less than a half wavelength
wide, the side lobes vanish. Hence, if an obstacle is less than 
half a wavelength wide, waves impinging on it are insensitive to 
the details of its structure. Conversely, if the slit is very wide 
compared to the wavelength, diffraction occurs only at the slit’s 
edges. Thus, for example, seismic reflection images show waves 
that diffracted around the ends of interfaces (Section 3.3.7). 

Similar effects occur when wave fronts encounter a circular 
(or spherical) obstacle (Fig. 2.5-19a). Geometric ray theory 
predicts that no energy will arrive behind the obstacle, so a hole 
in the wave front will develop and never close. In reality, the 
wave diffracts around the sphere, closing the gap behind it. The 
successive wave fronts illustrate why it is difficult to seismically 
observe an obstacle or a low-velocity zone. As the wave fronts 
continue after passing the sphere, the break in the wave front 
fills in with energy from either side until at large distances the 
delay from the obstacle is no longer observable. This process, 
called waveform annealing , also occurs if the obstacle has a 
lower velocity (Fig. 2.5-19b), so much of the energy arriving 
behind the obstacle diffracts around the obstacle rather than 
passing slowly through it. This effect can also be interpreted 
using Fermat’s principle, because the resulting wave is that 
which traveled for the least time. 

This example illustrates one possible reason why it has 
proved very difficult to seismologically observe plumes,
upwellings from deep in the mantle that have been proposed
to give rise to island chains like Hawaii. A seismic wave 
front encountering a narrow conduit of hot, slow rock 
diffracts around it, causing little travel time delay. By contrast, 
anomalously fast rock is easy to “see” seismologically. Hence 
seismology is very good at detecting subducting lithosphere at 
trenches (Section 5.4), because the cold material has a higher 
seismic velocity. This effect is illustrated by Fig. 2.5-19c, which 
shows a spherical anomaly faster than the surrounding 
material. By Fermat’s principle, the anomaly is the fastest path 
between a source and a receiver. From the Huygens’ principle 
view, the wave front moves further ahead through the fast 
material, and then spreads out laterally, advancing the rest of 
the wave front. The waves thus lose their planar appearance 
and appear to have emanated from a point source. 

These analyses show that Huygens’ principle describes the 
general features of diffraction. However, it does not provide 
direct information about amplitudes. For instance, although 
the wave fronts in Fig. 2.5-19 lose amplitude as they diffract 
around the sphere, this decay cannot be obtained from 
Huygens’ principle. To go further requires an extension of 
Huygens’ principle known as the Kirchhoff integral, which is 
beyond our scope. 
Fig. 2.5-19 Waves interacting with a spherical anomaly. a: A straight
wave front diffracts around a circular or spherical obstacle, as described
by Huygens’ principle. Only the leading wave front is shown. This
formulation shows the locations of the wave fronts, but not their
amplitudes. b: Plane waves interacting with a low-velocity anomaly 30%
slower than the surrounding material. The waves slow within the
anomaly and diffract around it. After passing the obstacle, the wave
front shows little perturbation, illustrating the difficulty of seismically
observing low-velocity anomalies. c: Plane waves interacting with an
anomaly 50% faster than the surrounding material. The overall speed of
the wave field increases, demonstrating that seismically fast anomalies
are easy to observe.



2.6 Plane wave reflection and transmission coefficients

2.6.1 Introduction 

Seismic waves propagating in the earth encounter several types
of interface (Fig. 2.6-1) at which physical properties change
over short distances. For example, the earth’s surface is a free
surface, and the sea floor is a liquid-solid interface. Variations
in velocity and density cause solid-solid interfaces such as the
Mohorovičić discontinuity, or Moho, separating the crust and
the mantle (Section 3.2). The upper and lower mantles are
divided by regions of rapid velocity changes (Section 3.5),
which can be described for many purposes as solid-solid
interfaces. The core-mantle boundary is an interface between the
solid mantle and fluid outer core, and the base of the outer core
is an interface with the solid inner core. Nearly all our
knowledge of these interfaces comes from observing their effects
on seismic wave propagation.

In the last section we derived Snell’s law, relating the
bending of waves at an interface to the velocity contrast across it.
We now discuss the amplitudes of the reflected and transmitted
waves. We first consider two simple cases, SH waves at a
boundary and P-SV waves at a free surface, and then outline
how the same approach is applied for P-SV waves at an
interface between solids. It turns out that although the angles of
reflection and transmission, and hence the ray paths and travel
times, depend only on the velocities, the amplitudes depend on
the elastic constants in a more complicated way. As a result,
the amplitudes of waves provide information beyond that
conveyed by travel times, and so are valuable for studying the
earth’s interior.



Fig. 2.6-1 Illustration (not to scale) of some of the interfaces within the
earth that affect seismic waves.

Fig. 2.6-2 Geometry for an SH wave in medium 1 incident on a solid-solid
interface with medium 2. $B_1$, $B_2$, and $B'$ are the amplitudes of the incident,
reflected, and transmitted SH waves. The displacement is in the $y$ direction.


2.6.2 SH reflection and transmission coefficients 

We first consider the amplitudes of SH waves reflected and
transmitted at a horizontal interface. Figure 2.6-2 illustrates
the geometry of an SH wave propagating in the x-z plane incident
on a boundary in the x-y plane between media with shear
velocities, rigidities, and densities $\beta_i$, $\mu_i$, and $\rho_i$. For SH waves,
the only nonzero component of displacement, $u_y$, satisfies the
wave equation (Eqn 2.5.12), so we write the displacements for
harmonic plane waves on either side of the boundary. Because
$z$ is defined positive downward, exponentials with $-k_x r_{\beta_i} z$
represent downgoing waves in medium $i$, and those with
$+k_x r_{\beta_i} z$ represent upgoing waves. In medium 1 ($z < 0$) there is a
downgoing incident wave with amplitude $B_1$ and an upgoing
reflected wave with amplitude $B_2$,

$$u_y^-(x, z, t) = B_1 \exp(i(\omega t - k_x x - k_x r_{\beta_1} z)) + B_2 \exp(i(\omega t - k_x x + k_x r_{\beta_1} z)). \qquad (1)$$

In medium 2 ($z > 0$) there is only a transmitted wave with
amplitude $B'$,

$$u_y^+(x, z, t) = B' \exp(i(\omega t - k_x x - k_x r_{\beta_2} z)). \qquad (2)$$

To find the amplitudes, we use the solid-solid interface
conditions (Section 2.3.10) that the displacement and traction are
continuous on the boundary $z = 0$ for all $x$ and $t$. The continuity
of displacement requires that

$$u_y^-(x, 0, t) = u_y^+(x, 0, t)$$

$$(B_1 + B_2) \exp(i(\omega t - k_x x)) = B' \exp(i(\omega t - k_x x)). \qquad (3)$$

When deriving Snell’s law, we found that $(\omega t - k_x x)$ is the same
for all three waves, so we cancel the exponentials and obtain
one condition on the amplitudes,

$$B_1 + B_2 = B'. \qquad (4)$$

The other condition comes from the requirement that the
traction vector, $T_i = \sigma_{ij} n_j$, be continuous. Because the unit
normal vector for the interface is (0, 0, 1), the stress components
$\sigma_{xz}$, $\sigma_{yz}$, $\sigma_{zz}$ are continuous. For SH waves $u_x$ and $u_z$ are
zero, so $\sigma_{xz} = \sigma_{zz} = 0$, and $\sigma_{yz}$ is continuous. To use this
condition we substitute

$$\sigma_{yz} = \mu \left( \frac{\partial u_y}{\partial z} + \frac{\partial u_z}{\partial y} \right) = \mu \frac{\partial u_y}{\partial z}. \qquad (5)$$

At points infinitesimally above and below the interface $z = 0$,
the stress satisfies

$$\mu_1 i k_x r_{\beta_1} (B_2 - B_1) \exp(i(\omega t - k_x x)) = -\mu_2 i k_x r_{\beta_2} B' \exp(i(\omega t - k_x x)). \qquad (6)$$

Canceling the factors common to both sides gives the second
condition

$$(B_1 - B_2) = B' (\mu_2 r_{\beta_2})/(\mu_1 r_{\beta_1}). \qquad (7)$$

Solving Eqns 4 and 7 simultaneously yields the amplitudes of
the reflected and transmitted waves. First, we eliminate $B_2$ and
find the transmission coefficient,

$$T_{12} = \frac{B'}{B_1} = \frac{2 \mu_1 r_{\beta_1}}{\mu_1 r_{\beta_1} + \mu_2 r_{\beta_2}}, \qquad (8)$$

the ratio of the amplitude of the transmitted wave in medium 2
to that of the incident wave in medium 1. Similarly, eliminating
$B'$ from Eqns 4 and 7 gives the reflection coefficient

$$R_{12} = \frac{B_2}{B_1} = \frac{\mu_1 r_{\beta_1} - \mu_2 r_{\beta_2}}{\mu_1 r_{\beta_1} + \mu_2 r_{\beta_2}}, \qquad (9)$$

the ratio of the amplitudes of the reflected and incident waves
in medium 1.

The reflection and transmission coefficients depend on the
angle of incidence because, by Eqn 2.5.38,

$$r_{\beta_i} = c_x \frac{\cos j_i}{\beta_i}. \qquad (10)$$

Hence, using Eqn 10 and recognizing that from the definition
of the S-wave velocity, $\mu_i = \rho_i \beta_i^2$, the reflection and transmission
coefficients can be written

$$T_{12} = \frac{2 \rho_1 \beta_1 \cos j_1}{\rho_1 \beta_1 \cos j_1 + \rho_2 \beta_2 \cos j_2}$$

$$R_{12} = \frac{\rho_1 \beta_1 \cos j_1 - \rho_2 \beta_2 \cos j_2}{\rho_1 \beta_1 \cos j_1 + \rho_2 \beta_2 \cos j_2}. \qquad (11)$$
Thus the reflection and transmission coefficients depend
on the acoustic impedances $\rho_i \beta_i$, as did those for waves on a
string (Section 2.2.3), but with an angle dependence that could
not occur for a one-dimensional string. If the media are
interchanged, the reflection coefficient reverses polarity, $R_{12} = -R_{21}$,
and the transmission coefficients satisfy $T_{12} + T_{21} = 2$. Due to
the displacement continuity condition (Eqn 3), $1 + R_{12} = T_{12}$.
Large impedance contrasts favor reflection, whereas small
contrasts favor transmission. In the limit of identical media there is
no reflection ($R_{12} = 0$), and everything is transmitted ($T_{12} = 1$).
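The relations in this paragraph can be checked numerically. The following is a minimal sketch of Eqn 11 for precritical incidence; the function name `sh_coefficients` and the material values (densities in g/cm³, velocities in km/s) are illustrative assumptions, not part of the derivation. Interchanging the media is done along the same ray, so the wave is sent back at the transmitted angle $j_2$.

```python
import math

def sh_coefficients(rho1, beta1, rho2, beta2, j1_deg):
    """Return (R12, T12, j2_deg) from Eqn 11 for precritical SH incidence."""
    j1 = math.radians(j1_deg)
    sin_j2 = beta2 * math.sin(j1) / beta1          # Snell's law
    if sin_j2 >= 1.0:
        raise ValueError("at or beyond critical incidence; Eqn 11 needs complex terms")
    j2 = math.asin(sin_j2)
    a = rho1 * beta1 * math.cos(j1)                # impedance weighted by cos j1
    b = rho2 * beta2 * math.cos(j2)                # impedance weighted by cos j2
    return (a - b) / (a + b), 2.0 * a / (a + b), math.degrees(j2)

# Assumed example media: (density, shear velocity).
upper = (2.8, 3.9)
lower = (3.3, 4.6)

r12, t12, j2_deg = sh_coefficients(*upper, *lower, 30.0)
# Interchange the media and send the wave back along the same ray.
r21, t21, _ = sh_coefficients(*lower, *upper, j2_deg)

print(f"R12 = {r12:+.3f}, T12 = {t12:.3f}")
print(f"R12 + R21 = {r12 + r21:.1e}, T12 + T21 = {t12 + t21:.3f}")
print(f"1 + R12 - T12 = {1 + r12 - t12:.1e}")
```

The three printed combinations should vanish or equal 2 to within rounding error, whatever precritical angle and media are chosen.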

An interesting effect occurs for an SH wave incident on the
earth’s free surface. Because $\beta_2 = 0$, the reflection coefficient
equals 1 regardless of the incidence angle, so the displacement
is twice that of the upgoing wave. This also occurs at
solid-liquid interfaces, such as the sea floor or the core-mantle
boundary, which act as free surfaces for SH because no SH
waves propagate in the liquid.

The transmission and reflection coefficients have a particularly
simple form for vertical incidence ($j_1 = j_2 = 0$):

$$T_{12} = \frac{2 \rho_1 \beta_1}{\rho_1 \beta_1 + \rho_2 \beta_2}, \qquad R_{12} = \frac{\rho_1 \beta_1 - \rho_2 \beta_2}{\rho_1 \beta_1 + \rho_2 \beta_2}. \tag{12}$$

These vertical incidence forms are easy to remember and are a
useful approximation for nonvertical incidence.

The fact that the transmission and reflection coefficients
depend on the contrast in both density and velocity, whereas
the angles made by the waves depend only on velocity, makes
the amplitudes valuable for studying elastic properties from
seismological observations. Although each medium has three
quantities of interest, $\beta_i$, $\rho_i$, and $\mu_i$, only two are independent,
because the velocities depend on the rigidities and densities. For
example, if we regard the velocity and rigidity as independent,
the angles of reflection and transmission give information about
the velocity, and the amplitudes provide additional information
about the rigidity.

2.6.3 Energy flux for reflected and transmitted SH waves

In some cases the transmission coefficient exceeds 1. For example,
when an SH wave impinges on a higher-velocity medium
at critical incidence, the transmitted wave becomes horizontal
($j_2 = 90°$) and Eqn 11 shows that the transmission coefficient is
2. As for the string (Section 2.2.4), this puzzling effect can be
explained by examining how the incident wave energy divides
between the reflected and transmitted waves.

We saw (Section 2.4.5) that the flux of energy per unit wave
front in the propagation direction associated with a harmonic
SH plane wave $u(x, t) = A \cos(\omega t - kx)$ is the product of the
energy density and the velocity

$$E = A^2 \omega^2 \rho \beta / 2. \tag{13}$$

Because no energy accumulates at an interface, the flux of
energy in the length of wave front incident on an element dx of
the interface equals that of the reflected and transmitted waves
removing energy from the interface. The length of the wave
fronts contributing to the flux depends on the angles of incidence.
Figure 2.6-3 shows that the relevant lengths are $\cos j_1 \, dx$
for the incident and reflected waves, and $\cos j_2 \, dx$ for the
transmitted wave. Thus, for an incident wave of unit amplitude, the
energy fluxes for the incident, reflected, and transmitted waves
are

$$E_I = \omega^2 \rho_1 \beta_1 \cos j_1 \, dx / 2$$

$$E_R = R_{12}^2 \omega^2 \rho_1 \beta_1 \cos j_1 \, dx / 2$$

$$E_T = T_{12}^2 \omega^2 \rho_2 \beta_2 \cos j_2 \, dx / 2. \tag{14}$$

These satisfy the conservation of energy

$$E_I = E_R + E_T, \tag{15}$$

as proved in one of this chapter’s problems. The ratios of the
transmitted and reflected energy fluxes to the incident energy
flux are

$$\frac{E_R}{E_I} = R_{12}^2, \qquad \frac{E_T}{E_I} = T_{12}^2 \frac{\rho_2 \beta_2 \cos j_2}{\rho_1 \beta_1 \cos j_1}. \tag{16}$$

Fig. 2.6-3 The lengths of the incident, reflected, and transmitted wave
fronts contributing to the energy flux through an element dx of an interface
depend on the cosine of the angle of incidence for each wave.

Because the energy ratios are proportional to the squares of
the amplitudes, small amplitudes represent very small energies.
For example, a reflected wave with $R_{12} = 0.1$ has an energy
ratio of $E_R / E_I = 0.01$.

To see the angle dependence, consider an interface between
media with $\beta_1$ = 3.9 km/s, $\rho_1$ = 2.8 g/cm³, and $\beta_2$ = 4.6 km/s,
$\rho_2$ = 3.3 g/cm³, which approximates the continental Mohorovicic
discontinuity. Figure 2.6-4 shows the reflection and transmission
coefficients and the ratio of energy fluxes for angles of incidence
between vertical and critical (58°). The energy flux ratios
sum to one, so, as the reflected energy increases, the transmitted
energy decreases.

Fig. 2.6-4 For an SH wave incident on a solid-solid boundary,
displacement reflection and transmission coefficients and the ratios of
reflected and transmitted energy fluxes to that of the incident wave are
given as functions of the angle of incidence of the incident wave. The
critical angle for these values is 58°.

At vertical incidence and for most of the range of incidence
angles less than the critical angle, most of the energy is transmitted.
In this range, the vertical incidence reflection and
transmission coefficients and energy flux ratios are good
approximations for nonvertical incidence. The behavior near the
critical angle illustrates the value of considering the energies as
well as the reflection and transmission coefficients. As the angle
of incidence approaches the critical value, the transmission
coefficient goes to 2, but the wave front factor $\cos j_2$ goes to
zero, so the energy in the transmitted wave vanishes and all of
the energy reflects.¹
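The approach to critical incidence can be traced with a short calculation. This sketch evaluates the SH coefficients and energy flux ratios for the crust-mantle example, assuming $\beta_2$ = 4.6 km/s (the shear velocity listed for the mantle in Fig. 2.6-11, which reproduces the 58° critical angle); the function name and the choice of sample angles are illustrative.

```python
import math

RHO1, BETA1 = 2.8, 3.9   # crust: g/cm^3, km/s
RHO2, BETA2 = 3.3, 4.6   # mantle; beta2 = 4.6 km/s is an assumed value

def sh_energy_ratios(j1_deg):
    """Return (E_R/E_I, E_T/E_I, T12) for precritical SH incidence."""
    j1 = math.radians(j1_deg)
    sin_j2 = BETA2 * math.sin(j1) / BETA1            # Snell's law
    cos_j2 = math.sqrt(1.0 - sin_j2 ** 2)
    a = RHO1 * BETA1 * math.cos(j1)
    b = RHO2 * BETA2 * cos_j2
    r12 = (a - b) / (a + b)
    t12 = 2.0 * a / (a + b)
    return r12 ** 2, t12 ** 2 * b / a, t12

critical = math.degrees(math.asin(BETA1 / BETA2))
print(f"critical angle = {critical:.1f} degrees")
for angle in (0.0, 30.0, 50.0, critical - 1e-6):
    e_r, e_t, t12 = sh_energy_ratios(angle)
    print(f"j1 = {angle:6.2f}  T12 = {t12:.3f}  E_R/E_I = {e_r:.3f}  "
          f"E_T/E_I = {e_t:.3f}  sum = {e_r + e_t:.3f}")
```

Just below the critical angle the transmission coefficient is close to 2 while the transmitted energy ratio is close to zero, which is the behavior described above.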

2.6.4 Postcritical SH waves

The transmitted and reflected waves behave differently for
angles of incidence greater than the critical angle. Snell’s law,

$$c_x = \beta_1 / \sin j_1 = \beta_2 / \sin j_2, \tag{17}$$

shows that for incidence angles less than the critical angle, the
apparent velocity exceeds the velocity of the second medium,
$\beta_2$. At critical incidence, $\sin j_2 = 1$, so the apparent velocity
equals $\beta_2$. For incidence angles greater than the critical angle,
$\sin j_1 > \sin j_c$, so the apparent velocity $c_x = \beta_1 / \sin j_1$ is less than
$\beta_1 / \sin j_c = \beta_2$.

To see the effect on the transmitted wave of an apparent
velocity less than that of medium 2, recall that the transmitted
wave (Eqn 2) is described by

$$u_y^+(x, z, t) = B' \exp(i(\omega t - k_x x - k_x r_{\beta_2} z)). \tag{18}$$

If $c_x < \beta_2$, the quantity (Eqn 2.5.8)

$$r_{\beta_2} = (c_x^2 / \beta_2^2 - 1)^{1/2} \tag{19}$$

becomes an imaginary number. As a result, $k_x r_{\beta_2}$, the z component
of the wavenumber, also becomes imaginary, so Eqn 18
no longer describes a plane wave propagating in the z direction.
The square root, which describes the imaginary number,
has two possible signs. We pick the negative sign and define

$$r_{\beta_2} = -i r^*_{\beta_2}, \qquad r^*_{\beta_2} = (1 - c_x^2 / \beta_2^2)^{1/2}, \tag{20}$$

so that the z term in the displacement,

$$\exp(-i k_x r_{\beta_2} z) = \exp(-k_x r^*_{\beta_2} z), \tag{21}$$

decays exponentially away from the interface in medium 2 as
$z \to \infty$. Thus, instead of being a propagating wave, the transmitted
wave becomes an evanescent or inhomogeneous wave
“trapped” near the interface. Choosing the negative sign in
Eqn 20 is a radiation boundary condition, because the opposite
choice gives displacement increasing with depth as $z \to \infty$, as if
energy originated there.

The behavior of the reflected wave for postcritical incidence
results from the fact that the reflection coefficient (Eqn 9)
becomes a complex number. Using Eqn 20 shows that

$$R_{12} = \frac{\mu_1 r_{\beta_1} - \mu_2 r_{\beta_2}}{\mu_1 r_{\beta_1} + \mu_2 r_{\beta_2}} = \frac{\mu_1 r_{\beta_1} + i \mu_2 r^*_{\beta_2}}{\mu_1 r_{\beta_1} - i \mu_2 r^*_{\beta_2}}. \tag{22}$$

This is a complex number divided by its conjugate, so the magnitude
of the reflection coefficient is 1, but there is a phase shift

$$R_{12} = e^{i 2 \varepsilon}, \qquad \varepsilon = \tan^{-1} \frac{\mu_2 r^*_{\beta_2}}{\mu_1 r_{\beta_1}}. \tag{23}$$

The phase shift depends on the angle of incidence. At critical
incidence, $c_x = \beta_2$, so $r^*_{\beta_2} = 0$ and $\varepsilon = 0°$. As the angle of incidence
increases beyond critical, $\varepsilon$ increases until grazing incidence,
$j_1 = 90°$, where $c_x = \beta_1$, $r_{\beta_1} = 0$, and $\varepsilon = 90°$. A 90° phase shift
turns a sine wave into a cosine wave, and vice versa, whereas a
180° phase shift is multiplication by -1. If the incident wave is
made up of different frequencies, the phase shift affects each
frequency, so the reflected wave can be computed using the
Fourier transform. Figure 2.6-5 illustrates how the reflected
wave would appear due to different phase shifts.

¹ The wave angles and amplitudes can be shown by a simple experiment using
beams of light (Klosko et al., 2000).
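The postcritical behavior can be illustrated numerically. The sketch below evaluates the complex reflection coefficient of Eqn 22 with the sign choice of Eqn 20, and compares its phase with $2\varepsilon$ from Eqn 23. The function name, the crust-mantle values, the 70° incidence angle, and the 1 Hz frequency used for the decay depth are all assumptions chosen for illustration.

```python
import cmath
import math

def sh_postcritical(rho1, beta1, rho2, beta2, j1_deg):
    """Return (R12, epsilon, r_star) for postcritical SH incidence."""
    j1 = math.radians(j1_deg)
    cx = beta1 / math.sin(j1)                      # apparent velocity (Eqn 17)
    if cx >= beta2:
        raise ValueError("incidence is not postcritical")
    mu1, mu2 = rho1 * beta1 ** 2, rho2 * beta2 ** 2
    r_b1 = math.sqrt(cx ** 2 / beta1 ** 2 - 1.0)
    r_star = math.sqrt(1.0 - cx ** 2 / beta2 ** 2)
    r_b2 = -1j * r_star                            # negative root (Eqn 20)
    refl = (mu1 * r_b1 - mu2 * r_b2) / (mu1 * r_b1 + mu2 * r_b2)
    eps = math.atan2(mu2 * r_star, mu1 * r_b1)     # Eqn 23
    return refl, eps, r_star

refl, eps, r_star = sh_postcritical(2.8, 3.9, 3.3, 4.6, 70.0)
print(f"|R12| = {abs(refl):.6f}")
print(f"phase of R12 = {math.degrees(cmath.phase(refl)):.2f} deg, "
      f"2*epsilon = {math.degrees(2 * eps):.2f} deg")

# e-folding depth of the evanescent wave for an assumed 1 Hz signal (km).
omega = 2.0 * math.pi * 1.0
k_x = omega * math.sin(math.radians(70.0)) / 3.9
decay_depth = 1.0 / (k_x * r_star)
print(f"amplitude falls by 1/e over {decay_depth:.2f} km")
```

Because the decay factor is $\exp(-k_x r^*_{\beta_2} z)$, the depth over which the evanescent wave is significant scales inversely with frequency.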
Fig. 2.6-5 The effect of phase shifts on a seismic waveform shown in the
upper trace. (Choy and Richards, 1975. © Seismological Society of
America. All rights reserved.)

Fig. 2.6-6 Geometry for a P wave in a halfspace incident upon a free
surface. $A_1$, $A_2$, and $B_2$ are the amplitudes of the incident P, reflected P,
and reflected SV waves.


2.6.5 P-SV waves at a free surface

Determining the amplitudes of reflected and transmitted waves
is more complicated for the P-SV system because waves convert
from one type to the other. To illustrate this, we consider
the simple case when a harmonic plane P wave incident on a
free surface generates two reflected waves, one P and one SV
(Fig. 2.6-6). To determine their amplitudes, we use potentials
for both P and SV, in contrast to the SH case, where we used
the displacements directly, and find solutions that satisfy the
free surface boundary conditions.

There are two scalar potential terms, one for the upgoing
incident P wave and one for the downgoing reflected P wave,

$$\phi_I(x, z, t) + \phi_R(x, z, t) = A_1 \exp(i(\omega t - k_x x + k_x r_\alpha z)) + A_2 \exp(i(\omega t - k_x x - k_x r_\alpha z)). \tag{24}$$

The downgoing reflected SV wave with amplitude $B_2$ is described
by a vector potential with y component

$$\psi_R(x, z, t) = B_2 \exp(i(\omega t - k_x x - k_x r_\beta z)). \tag{25}$$

Using Eqn 2.5.5, the two nonzero components of the displacement
are given by a combination of the P and SV potentials

$$u_x = \frac{\partial \phi}{\partial x} - \frac{\partial \psi}{\partial z}, \qquad u_z = \frac{\partial \phi}{\partial z} + \frac{\partial \psi}{\partial x}. \tag{26}$$

At the free surface, the traction vector, and hence the stress
components $\sigma_{xz}$, $\sigma_{yz}$, $\sigma_{zz}$, must be zero for all x and t. $\sigma_{yz}$ is
automatically zero for P-SV waves in this geometry. Using
Eqn 26, we express the other two stress components in terms of
the potentials

$$\sigma_{xz} = 2 \mu e_{xz} = \mu \left( \frac{\partial u_x}{\partial z} + \frac{\partial u_z}{\partial x} \right) = \mu \left( 2 \frac{\partial^2 \phi}{\partial x \partial z} + \frac{\partial^2 \psi}{\partial x^2} - \frac{\partial^2 \psi}{\partial z^2} \right)$$

$$\sigma_{zz} = \lambda \theta + 2 \mu e_{zz} = \lambda \left( \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial z^2} \right) + 2 \mu \left( \frac{\partial^2 \phi}{\partial z^2} + \frac{\partial^2 \psi}{\partial x \partial z} \right). \tag{27}$$

We then substitute the wave potentials from Eqns 24 and 25
into Eqn 27 and evaluate them at z = 0:

$$\sigma_{xz}(x, 0, t) = 0 = \mu [2 r_\alpha (A_1 - A_2) + (r_\beta^2 - 1) B_2] k_x^2 \exp(i(\omega t - k_x x))$$

$$\sigma_{zz}(x, 0, t) = 0 = -[\lambda (1 + r_\alpha^2)(A_1 + A_2) + 2 \mu (r_\alpha^2 (A_1 + A_2) + r_\beta B_2)] k_x^2 \exp(i(\omega t - k_x x)). \tag{28}$$

Regrouping terms shows that the ratios of the amplitudes of
the reflected P and SV waves to that of the incident P wave can
be found by solving the two equations

$$2 r_\alpha \frac{A_2}{A_1} + (1 - r_\beta^2) \frac{B_2}{A_1} = 2 r_\alpha \tag{29}$$

$$[(\lambda + 2\mu)(1 + r_\alpha^2) - 2\mu] \frac{A_2}{A_1} + 2 \mu r_\beta \frac{B_2}{A_1} = 2\mu - (\lambda + 2\mu)(1 + r_\alpha^2). \tag{30}$$

Because $(1 + r_\alpha^2) = (c_x^2 / \alpha^2) = c_x^2 \rho / (\lambda + 2\mu)$, the last equation can
be simplified to

$$(c_x^2 \rho - 2\mu) \frac{A_2}{A_1} + 2 \mu r_\beta \frac{B_2}{A_1} = 2\mu - c_x^2 \rho. \tag{31}$$

Solving Eqns 29 and 31 using $(1 + r_\beta^2) = (c_x^2 / \beta^2) = c_x^2 \rho / \mu$ gives the
amplitude ratios
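$$R_P = \frac{A_2}{A_1} = \frac{4 r_\alpha r_\beta - (r_\beta^2 - 1)^2}{4 r_\alpha r_\beta + (r_\beta^2 - 1)^2}$$

$$R_{SV} = \frac{B_2}{A_1} = \frac{4 r_\alpha (1 - r_\beta^2)}{4 r_\alpha r_\beta + (r_\beta^2 - 1)^2}. \tag{32}$$

These can be written in many forms, including

$$R_P = \frac{A_2}{A_1} = \frac{4 p^2 \eta_\alpha \eta_\beta - (\eta_\beta^2 - p^2)^2}{4 p^2 \eta_\alpha \eta_\beta + (\eta_\beta^2 - p^2)^2}$$

$$R_{SV} = \frac{B_2}{A_1} = \frac{4 p \eta_\alpha (p^2 - \eta_\beta^2)}{4 p^2 \eta_\alpha \eta_\beta + (\eta_\beta^2 - p^2)^2}. \tag{33}$$

The last form has the advantage that at vertical incidence the
vertical slownesses are $\eta_\alpha = 1/\alpha$ and $\eta_\beta = 1/\beta$, whereas $r_\alpha$ and
$r_\beta$ are infinite (Eqn 2.5.36).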

These amplitude ratios are the reflection coefficients for the
P and SV potentials. In general, both reflected P and SV result.
At vertical incidence the ray parameter p is zero, and Eqn 33
shows two interesting features. First, none of the incident P
wave converts to reflected SV energy ($B_2 = 0$). Second, the
reflected P wave is inverted because $A_2 / A_1 = -1$. These effects
also occur at grazing incidence, i = 90°, because $\eta_\alpha$ is zero.
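The two limits can be confirmed by evaluating the slowness form of the potential reflection coefficients (Eqn 33). In this minimal sketch the function name and the halfspace velocities (6.0 and 3.5 km/s) are assumed for illustration only.

```python
import math

def free_surface_p_coefficients(alpha, beta, i_deg):
    """Return (R_P, R_SV) for the potentials, from the slowness form (Eqn 33)."""
    p = math.sin(math.radians(i_deg)) / alpha            # ray parameter
    # max() guards against a tiny negative value from rounding at grazing incidence
    eta_a = math.sqrt(max(0.0, 1.0 / alpha ** 2 - p ** 2))
    eta_b = math.sqrt(1.0 / beta ** 2 - p ** 2)
    coupling = 4.0 * p ** 2 * eta_a * eta_b
    shear = (eta_b ** 2 - p ** 2) ** 2
    r_p = (coupling - shear) / (coupling + shear)
    r_sv = 4.0 * p * eta_a * (p ** 2 - eta_b ** 2) / (coupling + shear)
    return r_p, r_sv

ALPHA, BETA = 6.0, 3.5     # assumed halfspace velocities, km/s
for angle in (0.0, 30.0, 60.0, 90.0):
    r_p, r_sv = free_surface_p_coefficients(ALPHA, BETA, angle)
    print(f"i = {angle:4.1f}  R_P = {r_p:+.4f}  R_SV = {r_sv:+.4f}")
```

At 0° and 90° the output shows a reflected P potential of opposite sign and no SV, while intermediate angles give both reflected waves.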

The ratios of the displacement for the incident P and reflected
P and SV waves can be found from the potentials using
Eqn 26:

Incident P: $(u_x, u_z)_{PI} = (-i k_x, \; i k_x r_\alpha) \, \phi_I$

Reflected P: $(u_x, u_z)_{PR} = (-i k_x, \; -i k_x r_\alpha) \, \phi_R$

Reflected SV: $(u_x, u_z)_{SR} = (i k_x r_\beta, \; -i k_x) \, \psi_R$. (34)

Because the displacements are real numbers, they can be found
by taking the real part of the complex expressions or by adding
the complex conjugates.

Using these expressions, the amplitude of any component of
the displacement can be found from the potential reflection and
transmission coefficients. Thus the ratio of the displacements
can differ by either a sign or a scale factor from the potential
reflection and transmission coefficients. To see this, consider
the ratios of the magnitudes of the displacements. Because the
components of the wave vectors for P and SV waves satisfy

$$k_\alpha = [k_x^2 + (k_x r_\alpha)^2]^{1/2} = \omega / \alpha, \qquad k_\beta = [k_x^2 + (k_x r_\beta)^2]^{1/2} = \omega / \beta, \tag{35}$$

the ratio of the magnitudes of the displacements for the
reflected and incident P waves is

$$\frac{|u|_{PR}}{|u|_{PI}} = \frac{k_\alpha |A_2|}{k_\alpha |A_1|} = \frac{|A_2|}{|A_1|}, \tag{36}$$

and the ratio of the magnitudes of the reflected SV and incident
P displacements is

$$\frac{|u|_{SR}}{|u|_{PI}} = \frac{k_\beta |B_2|}{k_\alpha |A_1|} = \frac{\alpha |B_2|}{\beta |A_1|}. \tag{37}$$
Fig. 2.6-7 The length of the incident and reflected wave fronts 
contributing to the energy flux at an element dx of a free surface 
depends on the cosine of the angle of incidence for each wave. 
We can gain further insight by considering how the incident
wave’s energy is partitioned between the two reflected waves.
From Eqn 2.4.65, a harmonic plane P wave has an energy flux
in the propagation direction

$$E = A^2 \omega^2 k_\alpha^2 \rho \alpha / 2, \tag{38}$$

and a similar result applies for an SV wave. The lengths of wave
fronts contributing to the flux at an element dx of the free
surface (Fig. 2.6-7) are $\cos i \, dx$ for the P waves and $\cos j \, dx$ for
the S wave. Thus the energy fluxes for the incident, reflected P,
and reflected SV waves are

$$E_{PI} = A_1^2 \omega^2 k_\alpha^2 \rho \alpha \cos i \, dx / 2$$

$$E_{PR} = A_2^2 \omega^2 k_\alpha^2 \rho \alpha \cos i \, dx / 2$$

$$E_{SR} = B_2^2 \omega^2 k_\beta^2 \rho \beta \cos j \, dx / 2, \tag{39}$$

so the ratios of the reflected energy fluxes to the incident energy
flux are

$$\frac{E_{PR}}{E_{PI}} = \frac{A_2^2}{A_1^2}, \qquad \frac{E_{SR}}{E_{PI}} = \frac{B_2^2}{A_1^2} \frac{\alpha \cos j}{\beta \cos i}. \tag{40}$$

Because energy does not accumulate at the free surface, these
ratios always sum to 1.
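As a numerical check on Eqn 40, the sketch below computes the two energy flux ratios and their sum. The function name and the velocities are assumptions; the second call uses a halfspace with $\alpha / \beta = \sqrt{3}$ (an assumed Poisson solid), for which the numerator of $R_P$ in Eqn 33 vanishes at i = 60°, so the incident P wave converts entirely to SV at that angle.

```python
import math

def free_surface_energy(alpha, beta, i_deg):
    """Return (E_PR/E_PI, E_SR/E_PI) for a P wave incident at angle i < 90 deg."""
    i = math.radians(i_deg)
    p = math.sin(i) / alpha                       # ray parameter
    eta_a = math.cos(i) / alpha                   # vertical P slowness
    eta_b = math.sqrt(1.0 / beta ** 2 - p ** 2)   # vertical S slowness
    coupling = 4.0 * p ** 2 * eta_a * eta_b
    shear = (eta_b ** 2 - p ** 2) ** 2
    r_p = (coupling - shear) / (coupling + shear)
    r_sv = 4.0 * p * eta_a * (p ** 2 - eta_b ** 2) / (coupling + shear)
    cos_j = beta * eta_b                          # cosine of the SV reflection angle
    return r_p ** 2, r_sv ** 2 * alpha * cos_j / (beta * math.cos(i))

for angle in (10.0, 30.0, 50.0, 80.0):
    e_p, e_sv = free_surface_energy(6.0, 3.5, angle)
    print(f"i = {angle:4.1f}  P: {e_p:.3f}  SV: {e_sv:.3f}  sum: {e_p + e_sv:.3f}")

# Assumed Poisson solid: complete P-to-SV conversion at 60 degrees.
e_p60, e_sv60 = free_surface_energy(math.sqrt(3.0), 1.0, 60.0)
print(f"Poisson solid, i = 60: P: {e_p60:.3f}  SV: {e_sv60:.3f}")
```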

Figure 2.6-8 shows an example of reflection coefficients and
energy flux ratios as a function of the angle of incidence of the
incoming P wave. Although there is no reflected SV wave at
the limits, vertical and grazing incidence, there is a wide range
of angles over which most of the energy reflects as SV. At two
angles, the incident P wave converts entirely to SV.

Fig. 2.6-8 For a P wave incident on a free surface, potential reflection and
transmission coefficients and the ratios of reflected and transmitted energy
fluxes to that of the incident wave are shown as functions of the angle of
incidence of the incident P wave.

Fig. 2.6-9 Geometry for a P wave incident on a solid-solid interface. $A_1$,
$A_2$, $B_2$, $A'$, and $B'$ are the amplitudes of the incident P wave, the reflected P
and SV waves, and the transmitted P and SV waves.

2.6.6 Solid-solid and solid-liquid interfaces 

The approach we used for P-SV waves at a free surface can be
extended to a solid-solid interface. Consider the usual geometry
(Fig. 2.6-9) in which P-SV waves propagating in the x-z
plane interact with a horizontal interface at z = 0. An incident
wave generates two reflected waves and two transmitted waves.
The four ratios of the amplitudes of the reflected P and SV and
transmitted P and SV waves to that of the incident wave are
found from the boundary conditions. There are four equations
because the x and z components of the displacement and traction
are continuous at the interface. The resulting solutions are
complicated and are not given here. Instead, we consider some
general principles and examples.

Fig. 2.6-10 Directions of propagation (solid line) and displacement
amplitudes (dashes) for vertically incident, reflected, and transmitted
P waves at a solid-solid interface.


The solutions are simple for vertical incidence. For a vertically
incident P wave, no SV waves are generated. The displacement
is only in the z direction, and the ratio of the displacement
of the transmitted P wave to that of the incident wave is

$$\frac{(u_z)_T}{(u_z)_I} = T_{12} = \frac{2 \rho_1 \alpha_1}{\rho_1 \alpha_1 + \rho_2 \alpha_2}. \tag{41}$$

The corresponding ratio for the reflected P wave is

$$\frac{(u_z)_R}{(u_z)_I} = R_{12} = \frac{\rho_1 \alpha_1 - \rho_2 \alpha_2}{\rho_1 \alpha_1 + \rho_2 \alpha_2}. \tag{42}$$

These ratios, the vertical incidence transmission and reflection
coefficients for displacements, satisfy $1 + R_{12} = T_{12}$, as required
by continuity of displacements. As for the SH case (Section
2.6.2), the vertical incidence transmission and reflection coefficients
depend only on the acoustic impedances. For the incident
SV case, no P waves are generated, and the ratios of the displacement
component $u_x$ have the same form, but in terms of
the shear velocity $\beta$.

Figure 2.6-10 illustrates an intriguing effect that occurs for a
P wave vertically incident on an interface where $\rho_1 \alpha_1 > \rho_2 \alpha_2$, so
$R_{12}$ is positive. If the incident P wave is a pulse in the +z direction
of propagation with unit amplitude, then the reflected P
wave is a pulse with amplitude $R_{12}$ in the +z direction. Hence
the motion in the incident wave is in its direction of propagation
(+z), whereas the motion in the reflected wave is opposite to
its direction of propagation (-z). Often the motion in a P wave
is called compressional if it is in the direction of propagation,
and dilatational if it is opposite the direction of propagation.
Thus an incident P wave with a compressional motion yields
a reflection with dilatational motion. Sometimes the positive
amplitude of motion for a P wave is defined to be in the
propagation direction, so the reflection coefficient is defined
with the opposite sign from Eqn 42.

Fig. 2.6-11 Interactions at a solid-solid interface between media having
$\alpha_1$ = 6.8 km/s, $\beta_1$ = 3.9 km/s, $\rho_1$ = 2.8 g/cm³, and $\alpha_2$ = 8.0 km/s,
$\beta_2$ = 4.6 km/s, $\rho_2$ = 3.3 g/cm³. These values correspond
approximately to the continental crust and
mantle at the Mohorovicic discontinuity.
Ray paths and ratios of reflected and
transmitted energy fluxes to that of the
incident wave are shown as a function
of incidence angle for P and SV waves
incident from above and below.

The amplitudes of the reflected and transmitted waves vary 
with the angle of incidence, as we illustrate by considering how 
the energy is partitioned between the four waves. Figure 2.6-11 
shows an example for velocities and densities approximating 
the continental Mohorovicic discontinuity. Ray paths and 
energy flux ratios for P and SV waves incident from above 
and below are plotted. The four ratios are between 0 and 1, and 
sum to 1 because energy is conserved. 


For a P wave vertically incident from above, the impedances
$\rho_1 \alpha_1 = 19.0$, $\rho_2 \alpha_2 = 26.4$, yield reflection and transmission
coefficients $R_{12} = -0.16$, $T_{12} = 0.84$, and energy flux ratios of

$$\frac{E_R}{E_I} = R_{12}^2 = 0.03, \qquad \frac{E_T}{E_I} = T_{12}^2 \frac{\rho_2 \alpha_2}{\rho_1 \alpha_1} = 0.97. \tag{43}$$

These ratios are a good approximation for angles of incidence
less than the critical angle $\sin^{-1}(\alpha_1 / \alpha_2) = 58°$ because almost
all the energy is transmitted as P. However, as the angle of
incidence approaches the critical value, the transmitted P
energy goes to zero, and most of the energy reflects as P. For
most postcritical incidence angles, up to ~10% of the energy
converts to SV, of which approximately half reflects and half is
transmitted. In the limit of grazing incidence, however, all the
energy is in the reflected P wave.

For a P wave incident from below, the situation is similar
except that there is no critical angle behavior. For vertical incidence,
the reflection and transmission coefficients are $R_{21} = 0.16$,
$T_{21} = 1.16$, and the energy flux ratios are the same as before,
because

$$\frac{E_R}{E_I} = R_{21}^2 = 0.03, \qquad \frac{E_T}{E_I} = T_{21}^2 \frac{\rho_1 \alpha_1}{\rho_2 \alpha_2} = 0.97. \tag{44}$$

At high angles of incidence, >70°, the energy is increasingly in
the reflected P wave.
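The vertical-incidence numbers quoted in Eqns 43 and 44 follow directly from the impedances. This sketch, with an assumed helper name, uses the crust and mantle values listed for Fig. 2.6-11 and evaluates both directions of incidence.

```python
def vertical_incidence(rho_in, v_in, rho_out, v_out):
    """Return (R, T, E_R/E_I, E_T/E_I) for vertical incidence from the 'in' medium."""
    z_in, z_out = rho_in * v_in, rho_out * v_out      # acoustic impedances
    refl = (z_in - z_out) / (z_in + z_out)
    trans = 2.0 * z_in / (z_in + z_out)
    return refl, trans, refl ** 2, trans ** 2 * z_out / z_in

crust = (2.8, 6.8)     # density (g/cm^3), P velocity (km/s)
mantle = (3.3, 8.0)

from_above = vertical_incidence(*crust, *mantle)
from_below = vertical_incidence(*mantle, *crust)

for label, (r, t, e_r, e_t) in (("above", from_above), ("below", from_below)):
    print(f"incident from {label}: R = {r:+.2f}, T = {t:.2f}, "
          f"E_R/E_I = {e_r:.2f}, E_T/E_I = {e_t:.2f}")
```

The displacement coefficients differ between the two directions, but the energy flux ratios are identical, as the text states.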

The behavior of an S wave incident from above is analogous
to that for a P wave incident from above. For this example,
the S wave impedances are $\rho_1 \beta_1 = 10.9$, $\rho_2 \beta_2 = 15.2$, and the
vertical incidence reflection and transmission coefficients are
the same as for P waves. Hence at vertical incidence, almost all
the energy is transmitted as S, a little reflects as S, and none
converts to P. For near-vertical incidence, < ~20°, this pattern
changes slowly. At shallower angles of incidence, however, the
situation is more interesting, because there are three critical
angles. Approaching the critical angle for the transmitted P
wave, $\sin^{-1}(\beta_1 / \alpha_2) = 29°$, the transmitted P energy increases
somewhat. Beyond this angle there is no transmitted P wave,
but the reflected P wave behaves in a similar way because it
vanishes for $\sin^{-1}(\beta_1 / \alpha_1) = 35°$. For larger angles of incidence,
only the reflected and transmitted S waves exist, and the energy
in the transmitted S wave falls off to zero at the critical angle
$\sin^{-1}(\beta_1 / \beta_2) = 58°$. Beyond this angle, the incident S wave
undergoes total internal reflection.
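The three critical angles for the S wave incident from above come from Snell's law applied to each faster wave type. A minimal sketch, using the velocities listed for Fig. 2.6-11 and an assumed helper name:

```python
import math

def critical_angle_deg(v_incident, v_other):
    """Critical angle for a wave of speed v_incident feeding a faster wave, else None."""
    if v_other <= v_incident:
        return None                      # slower wave: no critical angle
    return math.degrees(math.asin(v_incident / v_other))

ALPHA1, BETA1 = 6.8, 3.9    # crust, km/s
ALPHA2, BETA2 = 8.0, 4.6    # mantle, km/s

s_from_above = {
    "transmitted P": critical_angle_deg(BETA1, ALPHA2),
    "reflected P": critical_angle_deg(BETA1, ALPHA1),
    "transmitted S": critical_angle_deg(BETA1, BETA2),
}
for wave, angle in s_from_above.items():
    print(f"{wave}: {angle:.1f} degrees")
```

The function returns `None` when the other wave is slower than the incident one, which is why a P wave incident from the mantle side shows no critical angle behavior.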

The final case, an S wave incident from below, is analogous
to that for a P wave incident from below. At vertical incidence
almost all the energy is transmitted as S, a little is reflected as S,
and none converts to P. There is a small reflected P wave near
its critical angle, $\sin^{-1}(\beta_2 / \alpha_2) = 35°$. More noticeably, the
transmitted P wave is enhanced near the critical angle for the
S-to-P conversion, $\sin^{-1}(\beta_2 / \alpha_1) = 42°$. At higher angles of incidence,
the transmitted S wave decreases as the reflected S wave
increases.

This example bears out the complexity of interactions at a
solid-solid interface. The detailed behavior depends on the four
velocities and two densities. A useful approximation is that for
media with similar impedances, most of the energy goes into
the transmitted wave of the same type (P or S) as the incident
wave. This makes sense, because if the materials were identical,
all the energy would be transmitted. For a wave incident from
a lower-velocity medium, this is approximately the case for
angles of incidence less than the critical angle for those two
waves. For a wave incident from the higher-velocity medium,
most of the energy is transmitted in the same type of wave until
near-grazing incidence. Because the incident wave is not seriously
affected by small impedance changes, waves propagating
through the earth change direction continuously according
to Snell’s law, but change amplitude significantly only at interfaces
where the impedance contrasts are large. If this were not
the case, we would not see distinct arrivals.

The approach used for the reflection and transmission coefficients
at a solid-solid interface can be extended to a solid-liquid
interface. Because there are no shear waves in the liquid, there
are three amplitude ratios. Similarly, because the fluid’s shear
velocity and rigidity are zero, there are three boundary conditions
at the interface: continuity of vertical displacement and
traction, and vanishing of the shear traction in the solid.

Fig. 2.6-12 Ray paths and energy flux ratios for an interface between the ocean, with $\alpha_1$ = 1.5 km/s, $\beta_1$ = 0.0 km/s, $\rho_1$ = 1.0 g/cm³, and an underlying crust
with $\alpha_2$ = 5.0 km/s, $\beta_2$ = 3.0 km/s, $\rho_2$ = 3.0 g/cm³. Three cases, P waves incident from above and P and SV waves incident from below, are shown.

Fig. 2.6-13 Schematic illustration of a marine seismic experiment, in
which a P wave generated in the water converts to P and S in the crust. The
upgoing crustal S waves partially reconvert to P at the sea floor. Although
no S waves travel through the water, the experiment can determine the
S-wave properties of the crust. Not all reflected and transmitted waves
are shown.

Figure 2.6-12 shows the three possible cases at the sea floor:
P waves incident from above and P and SV waves incident from
below. Because the impedance contrast at the sea floor is much
greater than in the Mohorovicic discontinuity example, the
relative amplitudes of the reflected and transmitted waves are
quite different from those in Fig. 2.6-11. First, consider a P
wave incident from above. At vertical incidence, $R_{12} = -0.82$,
$T_{12} = 0.18$, so two-thirds of the incident energy reflects and
only one-third is transmitted. As the angle of incidence increases,
the fraction of reflected energy remains approximately the
same, but the transmitted S wave grows at the expense of transmitted
P. The first critical angle behavior occurs for transmitted
P near $\sin^{-1}(\alpha_1 / \alpha_2) = 17°$. Beyond this angle, a significant
transmitted S wave exists until the critical angle for the P-to-S
conversion, $\sin^{-1}(\alpha_1 / \beta_2) = 30°$. For larger angles of incidence,
the incident P wave is totally reflected.
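The sea floor numbers in this paragraph can be reproduced from the values listed for Fig. 2.6-12. The sketch below is a simple check under that assumption; the function names are illustrative, and only the vertical-incidence energy split and the two critical angles are computed, not the full angle-dependent solution.

```python
import math

def vertical_energy_split(rho1, alpha1, rho2, alpha2):
    """Return (R12, T12, reflected fraction, transmitted fraction) at vertical incidence."""
    z1, z2 = rho1 * alpha1, rho2 * alpha2
    refl = (z1 - z2) / (z1 + z2)
    trans = 2.0 * z1 / (z1 + z2)
    return refl, trans, refl ** 2, trans ** 2 * z2 / z1

# Ocean over crust: density (g/cm^3) and velocities (km/s).
RHO_W, ALPHA_W = 1.0, 1.5
RHO_C, ALPHA_C, BETA_C = 3.0, 5.0, 3.0

r12, t12, reflected, transmitted = vertical_energy_split(RHO_W, ALPHA_W, RHO_C, ALPHA_C)
p_critical = math.degrees(math.asin(ALPHA_W / ALPHA_C))   # transmitted P
s_critical = math.degrees(math.asin(ALPHA_W / BETA_C))    # P-to-S conversion

print(f"R12 = {r12:+.2f}, T12 = {t12:.2f}")
print(f"energy reflected = {reflected:.2f}, transmitted = {transmitted:.2f}")
print(f"critical angles: P {p_critical:.0f} deg, S {s_critical:.0f} deg")
```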

Comparison of this case with that of the P wave incident
from above in the Moho example (Fig. 2.6-11) shows several
differences. In both examples P waves impinge on a medium
of higher velocity. Because the sea floor impedance contrast is
much greater, most of the energy reflects at vertical incidence,
and this situation persists for all angles of incidence. By contrast,
for the Moho example, most of the energy is transmitted
until the critical angle. The critical angle for transmitted P
occurs for much steeper incidence at the sea floor because the
P-velocity contrast is much greater. The transmitted S behavior
is very different in the two examples: $\alpha_1 > \beta_2$ for the Moho,
so there is no critical angle for transmitted S. By contrast, at
the sea floor a significant portion of the incident energy is
converted and transmitted for angles less than the critical angle
for transmitted S.

The results for waves incident from below also differ significantly
between the two examples. A P wave incident on the
sea floor from below is primarily reflected downward, largely
as a P wave for angles less than ~20°, and largely as an S wave
for angles greater than ~30°. Less than one-third of the energy
is ever transmitted. By contrast, for the Moho example, almost
all the incident P energy is transmitted until near-grazing incidence.
For an S wave incident from below, all the energy reflects
as S at vertical incidence, because there is no transmitted S in
the water. At low angles of incidence, the fraction of reflected P
increases until near the critical angle $\sin^{-1}(\beta_2 / \alpha_2) = 37°$. For
most angles of incidence, a significant portion of the incident
upgoing S wave is converted to upgoing P and transmitted.
This strong converted transmission does not occur in the Moho
example.

Fig. 2.6-14 Schematic illustration of a seismic reflection experiment,
in which vertically incident waves reflect from a region with a variable
velocity structure. The vertical ray paths are offset for clarity. The
media have $\alpha_1$ = 2.6 km/s, $\rho_1$ = 2.5 g/cm³, $\alpha_2$ = 1.7 km/s, $\rho_2$ = 2.0 g/cm³,
$\alpha_3$ = 2.2 km/s, $\rho_3$ = 2.2 g/cm³, $\alpha_4$ = 2.3 km/s, $\rho_4$ = 2.3 g/cm³. Impulse
seismograms showing the arrivals resulting from an incident P-wave
pulse of unit amplitude are plotted with time increasing downward.
The resulting arrivals have amplitudes $R_{12} = 0.3$, $T_{12} R_{23} T_{21} = -0.2$,
$T_{12} T_{23} R_{34} T_{32} T_{21} = -0.04$, and are separated by the time required to
traverse the layers. The corresponding reflection from a point to one side
of the region has amplitude $R_{14} = 0.1$. (After Dobrin, 1976.)

The facts that P waves incident from the water give rise to
significant S waves in the crust and that S waves incident from
the crust yield substantial transmitted P waves in the water
have important consequences for marine seismology. Seismic
sources in the water can generate transmitted S waves in the
crust, whose propagation can be studied using P waves reconverted
at the sea floor from upcoming S waves. Thus the
oceanic crust and upper mantle can be studied with both P
waves and S waves, using sources that generate only P waves
and receivers that detect only P waves (Fig. 2.6-13).








2.6.7 Examples 

Using the amplitudes of reflected, converted, and transmitted
waves to study interfaces is common in seismology, as we illustrate
with two examples. In reflection seismology, P waves generated
by near-surface sources and reflected from interfaces at
depth are used to study the crust and uppermost mantle. We
will see in the next chapter that the downgoing waves impinge
on the reflectors at steep angles of incidence, and the data
are often processed to simulate vertical incidence. Because the
impedance contrasts are small, it is common to neglect P-to-S
conversions and estimate the amplitudes of the reflected and
transmitted P waves using vertical incidence reflection and
transmission coefficients. The reflection and transmission coefficients
inferred from seismic data are combined with the travel
times to yield information about the subsurface geology.

Consider (Fig. 2.6-14) a hypothetical region where natural
gas, oil, and saltwater are trapped in the pores of a sand unit.
To describe the response of this region to a P wave impulse of
unit amplitude, we consider only the first, or primary, reflection
from each layer, because subsequent multiple reflections
would be smaller. The resulting arrivals have amplitudes $R_{12}$,
$T_{12}R_{23}T_{21}$, and $T_{12}T_{23}R_{34}T_{32}T_{21}$ and are separated by the
time required to traverse the layers. By contrast, the corresponding
reflection from a point to one side of the region would
have amplitude $R_{14}$. The lateral variation in impedance contrasts
causes a significant difference in the amplitude of the
reflected waves.
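
The primary amplitudes above can be sketched numerically. This is a minimal illustration, assuming the vertical-incidence coefficients take the impedance form $R_{12} = (Z_1 - Z_2)/(Z_1 + Z_2)$ and $T_{12} = 2Z_1/(Z_1 + Z_2)$ with $Z = \rho\alpha$ (one common sign convention). The layer densities and velocities are hypothetical values chosen for illustration, not those of Fig. 2.6-14.

```python
# Vertical-incidence coefficients in terms of acoustic impedance Z = density * velocity.
# Assumed sign convention: R = (Z_in - Z_out) / (Z_in + Z_out), T = 2 Z_in / (Z_in + Z_out).

def reflection(z_in, z_out):
    return (z_in - z_out) / (z_in + z_out)

def transmission(z_in, z_out):
    return 2.0 * z_in / (z_in + z_out)

# Hypothetical layers: (density in g/cm^3, P velocity in km/s).
# 1 = cap rock, 2 = gas sand, 3 = oil sand, 4 = saltwater sand.
layers = {1: (2.40, 3.0), 2: (2.00, 2.2), 3: (2.15, 2.6), 4: (2.25, 2.9)}
Z = {n: rho * alpha for n, (rho, alpha) in layers.items()}

def R(i, j):
    return reflection(Z[i], Z[j])

def T(i, j):
    return transmission(Z[i], Z[j])

# Primary reflections from the three interfaces, for a unit-amplitude impulse.
primaries = [
    R(1, 2),
    T(1, 2) * R(2, 3) * T(2, 1),
    T(1, 2) * T(2, 3) * R(3, 4) * T(3, 2) * T(2, 1),
]

# Reflection to one side of the trap, where layer 1 rests directly on layer 4.
off_to_side = R(1, 4)

if __name__ == "__main__":
    print("primaries:", [round(a, 4) for a in primaries])
    print("R14:", round(off_to_side, 4))
```

With these assumed values the reflection above the gas sand is several times larger than $R_{14}$, which is the kind of lateral amplitude contrast the text describes.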

For a second example, consider the downgoing slab of 
lithosphere at a subduction zone. As discussed in Chapter 5, 
the slab is colder than the surrounding mantle, and hence has 
higher seismic velocity. Seismic waves propagating in several 
geometries (Fig. 2.6-15) are used to study the upper surface of 
the slab. In one, ScS, an S wave reflected at the core-mantle 
boundary, is partially converted to a P wave, ScSp, at the slab 
surface. The ray paths can be found by using Snell’s law at 
the dipping interface. Assume that the downgoing slab and
overlying mantle have velocities $\alpha_1$, $\beta_1$ and $\alpha_2$, $\beta_2$ and the slab
dips at angle $\theta$. A vertically traveling ScS wave impinges on
the interface at an angle $j_1 = \theta$, so the angles of incidence for
transmitted ScS and ScSp are $j_2 = \sin^{-1}[(\beta_2/\beta_1) \sin j_1]$ and
$i_2 = \sin^{-1}[(\alpha_2/\beta_1) \sin j_1]$. The amplitude of ScSp is enhanced
because the ScS incidence angle is close to the critical angle 
for the conversion. ScSp travels faster than ScS and appears 
at seismometers primarily on the vertical component, whereas 
ScS arrives later and is primarily on the horizontal compon¬ 
ent. Additional information is obtained from P waves that 
reflect off the interface and appear at seismometers above 
the subduction zone later and with higher apparent velocity 
(steeper incidence) than the direct arrival. The travel times and 


Fig. 2.6-15 Study of a subducting slab 
using seismic waves reflected and converted 
at its upper surface. Top: Ray paths for the 
conversion of upcoming ScS to ScSp and the 
reflection of P waves. (Helffrich et al., 1989.
J. Geophys. Res., 94, 753-63, copyright by
the American Geophysical Union.) Lower
left: Application of Snell’s law at the
dipping interface for the ScS to ScSp
conversion. Lower right: Seismograms
showing ScS and ScSp recorded in
Hokkaido, Japan, for an earthquake in
Honshu, Japan. ScSp arrives on the vertical
component before ScS appears on the
horizontal components. (Snoke et al., 1979.)

Fig. 2.7-1 Three-component seismogram of a magnitude $M_w$ 7.7 shallow earthquake in the Vanuatu trench recorded 12,250 km away at station CCM.
Note the large size of the surface waves compared to the preceding body waves. The Love wave is observed on the transverse component, and the Rayleigh
wave is primarily seen on the vertical and radial components.


amplitudes of these waves are used to estimate the depth to the 
interface and the velocity contrast there, and hence to draw 
inferences about its thermal and mineralogical state. 
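
The Snell's law step for the ScS to ScSp conversion can be sketched as follows. This is an illustrative calculation only: the function name `converted_angles` and the velocities used in the example are assumptions, with medium 1 taken as the slab (incident side) and medium 2 as the overlying mantle.

```python
import math

def converted_angles(dip_deg, beta1, alpha2, beta2):
    """Angles (degrees) of transmitted ScS (j2) and converted ScSp (i2)
    for a vertically traveling ScS wave meeting an interface dipping at dip_deg.
    Returns None for a wave that is beyond its critical angle."""
    j1 = math.radians(dip_deg)          # incidence angle equals the dip
    p = math.sin(j1) / beta1            # ray parameter along the interface

    def angle(velocity):
        s = velocity * p
        return math.degrees(math.asin(s)) if s <= 1.0 else None

    return angle(beta2), angle(alpha2)

if __name__ == "__main__":
    # Hypothetical velocities in km/s: slab S, mantle P, mantle S.
    for dip in (20.0, 30.0, 40.0):
        print(dip, converted_angles(dip, beta1=4.7, alpha2=8.0, beta2=4.4))
```

For these assumed velocities the conversion becomes critical when $\sin\theta = \beta_1/\alpha_2$, so steeper dips return `None` for ScSp while ScS is still transmitted.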

2.7 Surface waves 

2.7.1 Introduction 

After our discussions of P and S waves, we might expect that 
the seismogram resulting from an earthquake would consist of 
pulses when P and S waves arrive, with later arrivals reflected 
and converted at interfaces within the earth. Generally, however,
seismograms (Fig. 2.7-1) are dominated by large longer-period
waves that arrive after the P and S waves. These waves
are surface waves whose energy is concentrated near the earth’s
surface. As a result of geometric spreading, their energy spreads
two-dimensionally and decays with distance $r$ from the source
approximately as $r^{-1}$, whereas the energy of body waves
spreads three-dimensionally and decays approximately as $r^{-2}$
(Section 2.4.3). Thus, at large distances from the source, surface
waves are prominent on seismograms.

Two types of surface waves, known as Love waves and 
Rayleigh waves after their discoverers, 1 propagate near the 
earth’s surface. Figure 2.7-1 shows a large surface wave train 
arriving on a seismometer’s transverse component, followed by 
another wave group on the vertical and radial components. We 
will see that the first wave train contains Love waves resulting 
from SH waves trapped near the surface. The second wave 
group contains Rayleigh waves, which are a combination of 
P and SV motions. In our usual geometry (Fig. 2.7-2) of waves 
propagating in the x-z plane, the Rayleigh wave displacement 
is in this plane, and the Love wave displacement is parallel to 
the $y$ axis. In this section, we examine the simplest cases of
Rayleigh and Love waves, and use them to demonstrate some
general ideas about surface waves.

1 Lord Rayleigh (1842-1919), best known among seismologists for pioneering
work in wave propagation, was awarded the Nobel prize for the discovery of argon.
A. E. H. Love (1863-1940) made fundamental contributions to both seismology and
geodynamics.

Fig. 2.7-2 Geometry for surface waves propagating in a vertical plane
containing the source and receiver. Rayleigh (P-SV) waves appear on the
vertical and radial components. Love (SH) waves appear on the transverse
component.

Fig. 2.7-3 Multiple surface waves circle the earth. Right: Odd-numbered
arrivals ($R_1$, $R_3$, etc.) take the shortest path from the earthquake to the
station, whereas even-numbered arrivals ($R_2$, $R_4$, etc.) travel in the
opposite direction. Left: Travel times, plotted against distance (°), for multiple Rayleigh ($R_n$) and
Love waves ($G_n$).

An interesting difference between surface and body waves, 
due to their different rates of decay, is that surface waves can 
circle the globe many times after a large earthquake. Figure 2.7-3
shows such multiple surface waves, which are denoted as
Rayleigh waves ($R_n$) and Love waves ($G_n$). The travel time plot
(Fig. 2.7-3, left) illustrates the increasing time required for
successive paths, indexed by $n$, from the earthquake to the
station. An important feature of surface waves is dispersion,
the fact that waves of different periods travel at different
velocities. As a result, the surface wave arrivals are not sharp
lines, but are spread out in time. These effects are shown in Fig.
2.7-4 by a record section composed of many vertical
component seismograms at different distances from earthquakes,
which yields an observed travel time plot. The data show
the arrivals of $R_1$, $R_2$, $R_3$, and $R_4$, and a comparable 6-hour
plot for the transverse component would show $G_1$ through $G_5$.


2.7.2 Rayleigh waves in a homogeneous half space 

Rayleigh waves are a combination of P and SV waves that can 
exist at the top of a homogeneous halfspace. To describe them, 
we define the free surface as z = 0, measure z downward, and 
use potentials for waves propagating in the x-z plane. We 
consider only P and SV waves, because they can satisfy the free 
surface boundary conditions and do not interact with SH 
waves. The P and SV potentials are 

$$\phi = A \exp(i(\omega t - k_x x - k_x r_\alpha z)),$$

$$\psi = B \exp(i(\omega t - k_x x - k_x r_\beta z)). \tag{1}$$

For a combination of these potentials to describe energy 
trapped near the free surface, two conditions must apply. The 
solution must both ensure that the energy does not propagate 
away from the surface and satisfy the free surface boundary 
conditions. 

For the energy to be trapped near the surface, the exponentials
$\exp(-ik_x r_\alpha z)$ and $\exp(-ik_x r_\beta z)$ must have negative real
exponents, so that the displacement will decay as $z \to \infty$. Because

$$r_\alpha = (c_x^2/\alpha^2 - 1)^{1/2}, \qquad r_\beta = (c_x^2/\beta^2 - 1)^{1/2}, \tag{2}$$

this radiation condition requires that $c_x < \beta < \alpha$, so that both
square roots become imaginary, with a choice of sign such that

$$r_\alpha = -i(1 - c_x^2/\alpha^2)^{1/2}, \qquad r_\beta = -i(1 - c_x^2/\beta^2)^{1/2}. \tag{3}$$

Thus $c_x$, the apparent velocity along the surface, must be less
than the shear velocity.

The other condition, the vanishing of traction at the free 
surface, arose for the P-SV reflection at a free surface (Section 
2.6.5). The difference here is that the boundary conditions are 
satisfied with no incident wave. Using Eqn 2.6.28 without an 
incident wave shows that when the stress components are ex¬ 
pressed in terms of the potentials, the amplitudes A and B must 
satisfy the continuity equations 

$$\sigma_{xz}(x, 0, t) = 0 = 2 r_\alpha A + (1 - r_\beta^2) B,$$

$$\sigma_{zz}(x, 0, t) = 0 = [\lambda(1 + r_\alpha^2) + 2\mu r_\alpha^2] A + 2\mu r_\beta B. \tag{4}$$

Eliminating the Lamé constants from the second equation
using $(1 + r_\alpha^2) = c_x^2/\alpha^2$ and the definitions of the velocities $\alpha$ and
$\beta$ gives a system of two homogeneous linear equations for
$A$ and $B$,

$$2(c_x^2/\alpha^2 - 1)^{1/2} A + (2 - c_x^2/\beta^2) B = 0,$$

$$(c_x^2/\beta^2 - 2) A + 2(c_x^2/\beta^2 - 1)^{1/2} B = 0. \tag{5}$$

This system has nontrivial solutions if the determinant of the
system is zero (Section A.4.4), such that

$$(2 - c_x^2/\beta^2)^2 + 4(c_x^2/\beta^2 - 1)^{1/2}(c_x^2/\alpha^2 - 1)^{1/2} = 0. \tag{6}$$

Fig. 2.7-4 Record section formed from
vertical seismograms at stations of the
IDA (International Deployment of
Accelerometers) network. The $R_1$ through
$R_4$ arrivals are spread out in time due to
dispersion and contain lines of energy
that cross the largest amplitudes at
small angles. As discussed later, the
lines show the phase velocity, and the
overall amplitude pattern shows the group
velocity. Body wave arrivals appear before
and after $R_1$. (Shearer, 1994. Eos, 75, 449,
451, 452. Copyright by the American
Geophysical Union.)

For a halfspace with given velocities $\alpha$ and $\beta$, this equation
gives the values of $c_x$ that satisfy the free surface boundary condition.
Of the four roots, one is zero, and only one is consistent
with the requirement that $0 < c_x < \beta$. For a Poisson solid, in
which $\alpha^2/\beta^2 = 3$, the determinant becomes

$$(c_x^2/\beta^2)\,[c_x^6/\beta^6 - 8c_x^4/\beta^4 + (56/3)c_x^2/\beta^2 - 32/3] = 0. \tag{7}$$

If we reject the trivial solution $c_x^2/\beta^2 = 0$, the equation is a
cubic in $c_x^2/\beta^2$, with roots 4, $2 + 2/\sqrt{3}$ (≈ 3.155) and $2 - 2/\sqrt{3}$
(≈ 0.845). Only the last root satisfies $c_x < \beta$, the condition for
waves to be trapped at the surface. Thus the apparent velocity
of the Rayleigh wave in a halfspace that is a homogeneous
Poisson solid is $c_x = (2 - 2/\sqrt{3})^{1/2}\beta = 0.92\beta$, slightly less than the
shear velocity.
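
As a numerical check on this result, the sketch below solves the Rayleigh equation (Eqn 6) directly by bisection, with the square roots written in their decaying (imaginary) form for $c_x < \beta$. The function names and the bracketing interval are choices made for this illustration, not part of the derivation.

```python
import math

def rayleigh_function(x, vp_over_vs_sq):
    """Left side of Eqn 6 for x = c_x^2/beta^2 < 1, where both square roots
    are imaginary and their product equals -sqrt(1 - x) * sqrt(1 - x * beta^2/alpha^2)."""
    return (2.0 - x) ** 2 - 4.0 * math.sqrt(1.0 - x) * math.sqrt(1.0 - x / vp_over_vs_sq)

def rayleigh_velocity_ratio(vp_over_vs_sq=3.0, lo=0.05, hi=1.0):
    """Return c_x/beta for the trapped (0 < c_x < beta) root, found by bisection.
    The bracket [lo, hi] skips the trivial root at x = 0."""
    f_lo = rayleigh_function(lo, vp_over_vs_sq)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = rayleigh_function(mid, vp_over_vs_sq)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return math.sqrt(0.5 * (lo + hi))

if __name__ == "__main__":
    print("c_x / beta for a Poisson solid:", rayleigh_velocity_ratio(3.0))
```

For a Poisson solid the root should agree with the closed-form value $(2 - 2/\sqrt{3})^{1/2}$ quoted above; other values of $\alpha^2/\beta^2$ can be passed in to see how weakly the ratio depends on Poisson's ratio.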


The coefficients of the potentials (Eqn 1), which can be found 
from Eqn 5, are 

$$B = A(2 - c_x^2/\beta^2)/(2 r_\beta) \tag{8}$$

and can be substituted into the potentials and used to find
the displacements (Eqn 2.6.26). Taking the real parts of the
exponentials and using the numerical values of $c_x/\beta$ and $c_x/\alpha$
for a Poisson solid gives

$$u_x = A k_x \sin(\omega t - k_x x)\,[\exp(-0.85 k_x z) - 0.58 \exp(-0.39 k_x z)],$$

$$u_z = A k_x \cos(\omega t - k_x x)\,[-0.85 \exp(-0.85 k_x z) + 1.47 \exp(-0.39 k_x z)]. \tag{9}$$

The displacement can be characterized by its variation in 
depth and distance along the surface. Both components are
sinusoidal functions of $(\omega t - k_x x)$, and thus harmonic waves
propagating in the $+x$ direction. Because the harmonic wave
solution applies only in the $x$ direction, the meaningful
wavelength is the horizontal wavelength along the surface,
$\lambda_x = 2\pi/k_x$. The displacement decays with depth as $\exp(-k_x z)$
(Fig. 2.7-5), so the depth to which a Rayleigh wave has significant
displacement is proportional to its horizontal wavelength.

Fig. 2.7-5 Variation with depth of the $x$ and $z$ components of
displacement for a Rayleigh wave in a halfspace composed of a Poisson
solid. Both components decay with depth, plotted here normalized by the
horizontal wavelength.

At the surface, $z = 0$, and the displacement components are

$$u_x = 0.42\,A k_x \sin(\omega t - k_x x),$$

$$u_z = 0.62\,A k_x \cos(\omega t - k_x x). \tag{10}$$

To visualize these, consider the motion of a particle of material
at $x = 0$ as a function of time. At $t = 0$, $u_z$ is a maximum ($z$ is
positive downward), and $u_x = 0$. As time increases, the $x$ and $z$
displacements combine to give counterclockwise, or “retrograde”,
motion about an ellipse (Fig. 2.7-6, left). For a Poisson
solid, the maximum vertical displacement at the surface is
about 1.5 times the maximum horizontal displacement. The
particle motion becomes “prograde” below a depth of about a
fifth of the wavelength, because the decaying exponential term
in $u_x$ becomes negative.
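
The numerical coefficients in Eqns 9 and 10, and the depth at which the motion changes from retrograde to prograde, can be reproduced with a short calculation. This sketch assumes the Poisson-solid root $c_x^2/\beta^2 = 2 - 2/\sqrt{3}$ and the amplitude ratio of Eqn 8; the variable and function names are illustrative.

```python
import math

x = 2.0 - 2.0 / math.sqrt(3.0)        # c_x^2 / beta^2 for a Poisson solid
a = math.sqrt(1.0 - x / 3.0)          # decay constant of the P term   (about 0.85)
b = math.sqrt(1.0 - x)                # decay constant of the SV term  (about 0.39)
sv_in_ux = (2.0 - x) / 2.0            # SV coefficient in u_x          (about 0.58)
sv_in_uz = (2.0 - x) / (2.0 * b)      # SV coefficient in u_z          (about 1.47)

def ux_depth_factor(kz):
    """Bracketed depth dependence of u_x in Eqn 9, with kz = k_x * z."""
    return math.exp(-a * kz) - sv_in_ux * math.exp(-b * kz)

def uz_depth_factor(kz):
    """Bracketed depth dependence of u_z in Eqn 9, with kz = k_x * z."""
    return -a * math.exp(-a * kz) + sv_in_uz * math.exp(-b * kz)

# Depth where u_x changes sign, expressed as a fraction of the horizontal wavelength.
kz_sign_change = math.log(1.0 / sv_in_ux) / (a - b)
depth_over_wavelength = kz_sign_change / (2.0 * math.pi)

if __name__ == "__main__":
    print("surface factors:", ux_depth_factor(0.0), uz_depth_factor(0.0))
    print("vertical/horizontal at surface:", uz_depth_factor(0.0) / ux_depth_factor(0.0))
    print("sign change at z / wavelength =", depth_over_wavelength)
```

The surface factors correspond to the 0.42 and 0.62 of Eqn 10, and the sign change of the horizontal factor falls near a fifth of a wavelength, consistent with the text.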

The phase relation between the horizontal and vertical com¬ 
ponents of Rayleigh wave motion can be seen on seismograms, 
as shown in Fig. 2.7-6 (right). When the vertical displacement
is at a negative maximum (e.g., about 785 s), the radial displacement
is zero, corresponding to $t = 0$ in Fig. 2.7-6 (left). A
quarter-period later (e.g., about 790 s) the vertical displacement
is zero, and the radial displacement is at its positive maximum,
corresponding to $t = T/4$.

Rayleigh waves also exist when the medium is more complicated
than a homogeneous halfspace. In this case, rather than
having a single apparent velocity for all frequencies, $c_x$ is a
function of frequency. We illustrate this idea next using Love
waves.


Fig. 2.7-6 For a Rayleigh wave, the horizontal (radial) and vertical components of ground motion are out of phase in a characteristic fashion.
Left: Because the components are out of phase, the particle motion at a point on the free surface as a function of time is a retrograde ellipse. The particle
moves opposite the direction of wave propagation at the top of the ellipse. Right (Rayleigh wave phase relationships: vertical and radial components): Comparison of the displacement components from seismograms of an
earthquake in the Kuril Islands recorded in Micronesia, showing that one peaks when the other is zero.

Fig. 2.7-7 Layer over a halfspace geometry for Love waves. Love waves
exist if the layer’s shear wave velocity is less than the halfspace velocity.
The waves can be treated as constructive interference between SH waves
incident on the interface beyond the critical angle.


2.7.3 Love waves in a layer over a half space 

A second type of surface wave, a Love wave, results from the 
interactions of SH waves. The simplest geometry (Fig. 2.7-7) in 
which a Love wave occurs is a layer of thickness $h$ of material
with velocity $\beta_1$ underlain by a halfspace of material with a
higher velocity $\beta_2$. Love waves require a velocity structure that
varies with depth, and so cannot exist in a halfspace, in contrast
to Rayleigh waves.

To describe the Love waves, we write the SH-wave displacement
in the layer as the sum of an upgoing and a downgoing
wave:

$$u_y^-(x, z, t) = B_1 \exp(i(\omega t - k_x x - k_x r_{\beta_1} z)) + B_2 \exp(i(\omega t - k_x x + k_x r_{\beta_1} z)). \tag{11}$$

In the halfspace we need only one term:

$$u_y^+(x, z, t) = B' \exp(i(\omega t - k_x x - k_x r_{\beta_2} z)). \tag{12}$$

As before, we impose a radiation boundary condition that
ensures that energy not travel into the halfspace as a propagating
wave. Energy will be trapped near the interface
if $\exp(-ik_x r_{\beta_2} z)$ is a negative real exponential that decays as
$z \to \infty$. This condition occurs if the apparent velocity is less
than the shear velocity in the halfspace, $c_x < \beta_2$, so

$$r_{\beta_2} = (c_x^2/\beta_2^2 - 1)^{1/2} = -i(1 - c_x^2/\beta_2^2)^{1/2} = -i r^*_{\beta_2}. \tag{13}$$

The amplitudes $B_1$, $B_2$, and $B'$ are found using the boundary
conditions at the free surface and at the interface between the
layer and the halfspace. At the free surface, $z = 0$, the traction
must be zero for all $x$ and $t$,

$$\sigma_{yz}(x, 0, t) = \mu_1 \frac{\partial u_y^-}{\partial z}(x, 0, t) = \mu_1 (ik_x r_{\beta_1})(B_2 - B_1) \exp(i(\omega t - k_x x)) = 0, \tag{14}$$

so $B_1 = B_2$. At the interface $z = h$, the displacement must be
continuous for all $x$ and $t$, so

$$B_1[\exp(-ik_x r_{\beta_1} h) + \exp(ik_x r_{\beta_1} h)] = B' \exp(-ik_x r_{\beta_2} h). \tag{15}$$

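Similarly, the stress component $\sigma_{yz}$ must also be continuous at
the interface for all $x$ and $t$, so

$$\mu_1(-ik_x r_{\beta_1}) B_1 [\exp(-ik_x r_{\beta_1} h) - \exp(ik_x r_{\beta_1} h)] = \mu_2(-ik_x r_{\beta_2}) B' \exp(-ik_x r_{\beta_2} h). \tag{16}$$

By combining the complex exponentials into sine and cosine
functions (Eqn A.2.10), conditions 15 and 16 can be written

$$2B_1 \cos(k_x r_{\beta_1} h) = B' \exp(-ik_x r_{\beta_2} h),$$

$$2i\mu_1 r_{\beta_1} B_1 \sin(k_x r_{\beta_1} h) = -\mu_2 r_{\beta_2} B' \exp(-ik_x r_{\beta_2} h). \tag{17}$$

Dividing the second condition by the first gives

$$\tan(k_x r_{\beta_1} h) = \frac{-\mu_2 r_{\beta_2}}{i\mu_1 r_{\beta_1}} = \frac{\mu_2 r^*_{\beta_2}}{\mu_1 r_{\beta_1}}, \tag{18}$$

where $r^*_{\beta_2} = (1 - c_x^2/\beta_2^2)^{1/2}$ is real for $c_x < \beta_2$.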

This equation has a special significance. It gives a relation
between the horizontal wavenumber, $k_x$, and the horizontal
apparent velocity, $c_x$, that must be satisfied for the Love
wave to exist. Because $c_x = \omega/k_x$, this means that, for a given
horizontal apparent velocity, Love waves must have specific
horizontal wavenumbers and thus angular frequencies. Alternatively,
for a particular period or angular frequency, Love
waves can have only certain horizontal apparent velocities
or wavenumbers. Hence different frequencies have different
apparent velocities, a phenomenon that is called dispersion.
Relations like Eqn 18, which give the apparent velocity, $c_x$,
as a function of $\omega$ or $k_x$, are called dispersion relations, or
period equations.

Before examining the dispersion relation further, we derive
it in a different way. The apparent velocity condition $c_x < \beta_2$
(Eqn 13) also arose (Section 2.6.4) for SH waves incident on an
interface at angles exceeding the critical angle, $\sin^{-1}(\beta_1/\beta_2)$. In
the geometry of Fig. 2.7-7, these waves are totally reflected
both at the interface and at the free surface, and so are trapped
in the layer.

Consider the portion of the ray path ABQ along which a
downgoing wave with incidence angle $j_1$ reflects at the interface
and then at the free surface. If the phase of the wave changes by
an integral multiple of $2\pi$, the downgoing wave front normal
to the ray path at Q will be in phase with, and thus interfere
constructively with, the downgoing wave front normal to the
ray path at A. The phase change in going from A to Q consists
of two terms, one due to the reflections and one due to the
propagation. By Eqn 2.6.23, the postcritical reflection causes a
phase change of $2\tan^{-1}[(\mu_2 r^*_{\beta_2})/(\mu_1 r_{\beta_1})]$, whereas the free surface
reflection does not change the phase. In addition, because
the wave propagated a distance AB + BQ, the phase changes by
$-(AB + BQ)k_{\beta_1}$. The distance can be written as

$$AB + BQ = BQ \cos 2j_1 + h/\cos j_1 = (\cos 2j_1 + 1)(h/\cos j_1) = 2h \cos j_1, \tag{19}$$

using $\cos 2j_1 = 2\cos^2 j_1 - 1$. The condition for constructive interference
is thus that the total phase change

$$-2k_{\beta_1} h \cos j_1 + 2\tan^{-1}[(\mu_2 r^*_{\beta_2})/(\mu_1 r_{\beta_1})] = 2n\pi, \tag{20}$$

or, because $\tan(n\pi) = 0$,

$$\tan(k_{\beta_1} h \cos j_1) = \tan(k_x r_{\beta_1} h) = (\mu_2 r^*_{\beta_2})/(\mu_1 r_{\beta_1}). \tag{21}$$

Thus the Love wave dispersion relation that we derived from 
the boundary conditions can also be viewed as an interference 
criterion for critically reflected SH waves, corresponding to 
propagating waves in the layer and an evanescent wave in the 
higher-velocity halfspace. 

2.7.4 Love wave dispersion 

The dispersion relation (Eqn 21) can be written as a function
of any two of the three related parameters $c_x$, $\omega$, and $k_x$. To
find solutions, we write it in terms of frequency and apparent
velocity as

$$\tan[(\omega h/c_x)(c_x^2/\beta_1^2 - 1)^{1/2}] = \frac{\mu_2 (1 - c_x^2/\beta_2^2)^{1/2}}{\mu_1 (c_x^2/\beta_1^2 - 1)^{1/2}}. \tag{22}$$

Because the tangent function is defined for real values, the
square roots must be real, so the apparent velocity is bounded
by $\beta_1 < c_x < \beta_2$. A graphical solution can be derived by defining a
new variable,

$$\zeta = (h/c_x)(c_x^2/\beta_1^2 - 1)^{1/2}, \tag{23}$$

so that over the allowable range of the apparent velocity,
$\zeta = 0$ at $c_x = \beta_1$, and $\zeta_{max} = h(1/\beta_1^2 - 1/\beta_2^2)^{1/2}$ at $c_x = \beta_2$. Hence,
Eqn 22 becomes

$$\tan(\omega\zeta) = \frac{\mu_2 (1 - c_x^2/\beta_2^2)^{1/2}}{\mu_1 (c_x/h)\,\zeta}. \tag{24}$$

As shown in Fig. 2.7-8, the left side of the equation, $\tan(\omega\zeta)$,
has zeroes at $\zeta = n\pi/\omega$ and goes to infinity at $\zeta = \pi/2\omega$, $3\pi/2\omega$,
etc. The right side of the equation, which has a hyperbolic appearance
because of the $1/\zeta$ dependence, is infinite for $c_x = \beta_1$,
where $\zeta = 0$, and decays monotonically to zero at $c_x = \beta_2$, where
$\zeta = \zeta_{max}$. Solutions exist where the two curves intersect, giving
the values of $\zeta$ and thus $c_x$ for which a Love wave with a given $\omega$
occurs. The solutions are called modes, so that for a given frequency
there are several modes, each with a different apparent
velocity. The leftmost solution, with the lowest $c_x$, is called the
fundamental mode; the others are higher modes, or overtones,
numbered 1 through $n$.

Fig. 2.7-8 Graphical solution of the dispersion relation for Love waves
in a layer over a halfspace, with panels for periods of 5 s, 10 s, and 30 s. Model parameters:
$\beta_1$ = 3.9 km/s, $\beta_2$ = 4.6 km/s, $\rho_1$ = 2.8 g/cm³, $\rho_2$ = 3.3 g/cm³, $h$ = 40 km.
The left side of Eqn 24 is represented by
the solid curves, $\tan(\omega\zeta)$, with zeroes at $n\pi/\omega$. The decreasing dashed
hyperbolas represent the right side of Eqn 24. The intersections of the
curves (dots) are the roots of the equation and give the apparent velocities
for a given period. The apparent velocities range between the shear
velocities of the layer ($\beta_1$) and the halfspace ($\beta_2$). For longer periods there
are fewer solutions and thus fewer modes.

Figure 2.7-8 illustrates Eqn 24 for three different periods
using a model for the continental crust and mantle of a 40 km-thick
layer with $\beta_1$ = 3.9 km/s and $\rho_1$ = 2.8 g/cm³ underlain by
a half space with $\beta_2$ = 4.6 km/s and $\rho_2$ = 3.3 g/cm³. For waves
with a period of 5 s, there are three solutions within the allowed
apparent velocity range: $c_x$ = 3.92, 4.13, and 4.55 km/s.
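
The graphical solution can also be carried out numerically. The sketch below is one possible approach, not the method used to produce the figure: it rewrites Eqn 22 as $g(c_x) = \mu_1 r_{\beta_1} \sin\theta - \mu_2 r^*_{\beta_2} \cos\theta = 0$ with $\theta = \omega h r_{\beta_1}/c_x$, which has the same roots but avoids the poles of the tangent, then scans $\beta_1 < c_x < \beta_2$ for sign changes and refines each by bisection. The function name and scan density are assumptions made for this illustration; the rigidities follow from $\mu = \rho\beta^2$.

```python
import math

def love_phase_velocities(period, h, beta1, rho1, beta2, rho2, nscan=4000):
    """Apparent velocities c_x (fundamental mode first) satisfying the Love wave
    dispersion relation for a layer over a halfspace, at the given period."""
    omega = 2.0 * math.pi / period
    mu1, mu2 = rho1 * beta1 ** 2, rho2 * beta2 ** 2

    def g(c):
        r1 = math.sqrt(max(c * c / beta1 ** 2 - 1.0, 0.0))    # r_beta1 (real)
        r2s = math.sqrt(max(1.0 - c * c / beta2 ** 2, 0.0))   # r*_beta2 (real)
        theta = omega * h * r1 / c
        return mu1 * r1 * math.sin(theta) - mu2 * r2s * math.cos(theta)

    grid = [beta1 + (beta2 - beta1) * i / nscan for i in range(nscan + 1)]
    roots = []
    for lo, hi in zip(grid[:-1], grid[1:]):
        if g(lo) * g(hi) < 0.0:                # a root is bracketed in this cell
            for _ in range(60):                # refine by bisection
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

if __name__ == "__main__":
    model = dict(h=40.0, beta1=3.9, rho1=2.8, beta2=4.6, rho2=3.3)
    for period in (5.0, 10.0, 30.0):
        print(period, [round(c, 3) for c in love_phase_velocities(period, **model)])
```

For the model of Fig. 2.7-8 this should return three velocities at 5 s, close to those quoted in the text, and fewer at the longer periods.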

Fig. 2.7-9 Dispersion curves giving the relationship between apparent
velocity and period for Love waves in a layer over a halfspace. For each
mode, the apparent velocities range from the layer velocity $\beta_1$ to the
halfspace velocity $\beta_2$. The bottom curve is the fundamental mode branch,
and the overtone branches are above it, with higher velocities for any
period. Dots show the modes from Fig. 2.7-8.

Consider now what happens for longer periods or lower frequencies.
The zeroes of the tangent curve $\zeta = n\pi/\omega$ increase,
so the spacing between the tangent curves, $\pi/\omega$, also increases.
As a result, there are fewer tangent curves within the allowable
range of $\zeta$, which is $n\pi/\omega < \zeta_{max}$. Thus, because the decaying
curve does not depend on $\omega$, there are fewer solutions, $c_x$, for
longer periods. For any given angular frequency, the solution
with the largest possible value of $\zeta$ occurs when the $n$th solution
is $\zeta_{max}$, so $c_x = \beta_2$. In this case, $\tan \omega\zeta_{max} = 0$, so $\omega\zeta_{max} = n\pi$, and

$$\omega = \omega_{cn} = n\pi/[h(1/\beta_1^2 - 1/\beta_2^2)^{1/2}]. \tag{25}$$

This angular frequency, called the cutoff angular frequency for
the $n$th higher mode, is the lowest $\omega$ at which this mode exists.
Tangent curves with larger values of $n$ are beyond the allowed
range of $\zeta$. Thus, for sufficiently long periods, only the fundamental
mode exists.
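
Eqn 25 gives a quick way to count the modes present at a given period without solving the dispersion relation. The sketch below applies it to the model of Fig. 2.7-8; the function names are illustrative.

```python
import math

def cutoff_angular_frequency(n, h, beta1, beta2):
    """Eqn 25: lowest angular frequency at which the n-th higher mode exists."""
    zeta_max = h * math.sqrt(1.0 / beta1 ** 2 - 1.0 / beta2 ** 2)
    return n * math.pi / zeta_max

def number_of_modes(period, h, beta1, beta2):
    """Count modes (fundamental plus overtones) whose cutoff lies below omega."""
    omega = 2.0 * math.pi / period
    zeta_max = h * math.sqrt(1.0 / beta1 ** 2 - 1.0 / beta2 ** 2)
    return int(omega * zeta_max / math.pi) + 1

if __name__ == "__main__":
    h, beta1, beta2 = 40.0, 3.9, 4.6
    for n in (1, 2, 3):
        w = cutoff_angular_frequency(n, h, beta1, beta2)
        print("mode", n, "cutoff period (s):", 2.0 * math.pi / w)
    for period in (5.0, 10.0, 30.0):
        print(period, "s:", number_of_modes(period, h, beta1, beta2), "mode(s)")
```

The first overtone cuts off at a period somewhat longer than 10 s in this model, which is why two modes are present at 10 s but only the fundamental at 30 s.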

Using this method, we can compute the apparent velocity 
values for different periods. Figure 2.7-9 shows the resulting 
curves, known as mode or overtone branches , for the funda¬ 
mental mode and the first two higher modes. At the longest 
periods only the fundamental mode exists, whereas for shorter 
periods higher modes occur. For example, at a period of 5 s 
there are three modes, for 10 s there are two modes, but at 30 s 
only the fundamental mode occurs. The longest-period modes
for each branch have $c_x \to \beta_2$, so their apparent velocity
depends on the shear velocity in the halfspace and is essentially
unaffected by the shear velocity in the layer. Thus at long
periods the branches in Fig. 2.7-9 approach the velocity in the
halfspace, $\beta_2$ = 4.6 km/s. Similarly, the shortest-period modes
for each branch have $c_x \to \beta_1$ = 3.9 km/s, so their apparent
velocity approaches the layer velocity.

This variation in apparent velocity reflects differences in displacement
among the modes. In the layer, because the amplitudes
$B_1$ and $B_2$ of the upgoing and downgoing waves are
equal, the displacement (Eqn 11) can be written

$$u_y^-(x, z, t) = 2B_1 \exp(i(\omega t - k_x x)) \cos(k_x r_{\beta_1} z). \tag{26}$$

In the halfspace, the displacement (Eqn 12) is

$$u_y^+(x, z, t) = B' \exp(i(\omega t - k_x x)) \exp(-k_x r^*_{\beta_2} z), \tag{27}$$

so, by the continuity of displacement at the interface $z = h$,

$$B' = 2B_1 \cos(k_x r_{\beta_1} h)/\exp(-k_x r^*_{\beta_2} h). \tag{28}$$

Thus, in both the layer and the halfspace, we have a wave
propagating in the $x$ direction, with horizontal wavenumber
$k_x = 2\pi/\lambda_x = \omega/c_x$. In the layer, the displacement varies with
depth as $\cos(k_x r_{\beta_1} z)$, and so oscillates. In the halfspace, the displacement
decays exponentially with depth as $\exp(-k_x r^*_{\beta_2} z)$.

The variation in displacement in the $x$ and $z$ directions is
illustrated in Fig. 2.7-10 for the three periods whose apparent
velocities were found in Fig. 2.7-8. The horizontal variation
is shown in the upper panels. Because the apparent velocity 
increases with period (Fig. 2.7-9), the horizontal wavelength 
increases with period for a given branch. Thus, for the funda¬ 
mental mode (n = 0) cases shown, the longest period (30 s) 
has the highest apparent velocity and thus the longest hori¬ 
zontal wavelength. At a given period (Fig. 2.7-9), the higher 
the mode, the higher the apparent velocity, and thus the longer 
the horizontal wavelength. Hence for the three modes shown 
for period 5 s, n = 2 has the longest horizontal wavelength. 

The variation with depth, known as the mode’s vertical 
eigenfunction , is different for each mode. For a given branch, 
the depth of penetration in the halfspace increases with period, 
so, of the fundamental mode periods shown, the longest (30 s) 
“sees” deepest into the higher velocity halfspace, and thus has 
the highest apparent velocity. Conversely, the shortest period 
modes on a given branch penetrate to the shallowest depth, and 
thus have the lowest apparent velocity. At a given period, the 
higher modes oscillate more rapidly with depth in the layer, 
and so change sign more frequently. In the halfspace, however, 
the higher modes decay more slowly and penetrate deeper. The 
eigenfunction for a mode with order n has n zero crossings, or 
nodes, with depth. 

The fact that the displacement behaves differently with depth 
for various modes and periods makes Love waves dispersive. 
In our derivation, the intrinsic shear velocities of the layer 
and halfspace do not depend on frequency. Nonetheless, the 
resulting apparent velocity along the free surface depends on 
frequency. This dispersion results from the fact that Love waves 
of different periods have different displacements with depth, 
and the intrinsic medium velocity varies with depth. As a result, 
surface wave dispersion is valuable for studying earth structure. 

By contrast, the halfspace Rayleigh wave does not show this 
dispersion. This wave is a “true” surface wave because it can 
exist in a homogeneous halfspace due to the interaction of P 
and SV waves. By contrast, the Love wave in a layer over a 
halfspace exists because the properties of the medium vary with 
depth, and so cause interference between SH waves. Dispersive 
Love waves and Rayleigh waves also occur in media whose 
properties vary with depth in a more complicated way. The dis¬ 
persion curves for Love and Rayleigh waves in such media can 
be calculated by several methods. One approach is to extend 
the method used in Section 2.7.3 by treating the medium as a 
set of homogeneous layers underlain by a halfspace. As for the
one layer case, we assume that the displacement in each layer is
given by the exponential solutions, and find combinations of
frequency and horizontal apparent velocity that satisfy the
boundary conditions at the free surface, at each layer boundary,
and in the halfspace. Another approach is to view surface
waves as the normal modes of the spherical earth (Section 2.9).

Fig. 2.7-10 Variation in displacement along
the surface (top) and as a function of depth
(bottom) for Love waves in a layer over a
halfspace. The figure shows the modes for
the three periods from Figs 2.7-8 and 9.

2.8 Dispersion

2.8.1 Phase and group velocity

In the last section, we saw that the Love wave was dispersive,
because its apparent velocity along the surface varied with
frequency. To explore dispersion further, we first consider the
simplest example, the net effect of two harmonic waves with
slightly different frequencies and wavenumbers. We next consider
dispersion in general terms, and discuss some features of
surface wave and tsunami dispersion.

Consider the sum of two harmonic waves with slightly different
angular frequencies and wavenumbers

$$u(x, t) = \cos(\omega_1 t - k_1 x) + \cos(\omega_2 t - k_2 x). \tag{1}$$

The angular frequencies and wavenumbers can be written in
terms of the differences from their average values $\omega$ and $k$:

$$\omega_1 = \omega + \delta\omega, \quad \omega_2 = \omega - \delta\omega, \quad \omega \gg \delta\omega,$$

$$k_1 = k + \delta k, \quad k_2 = k - \delta k, \quad k \gg \delta k. \tag{2}$$

Using this substitution, we add the two cosines and simplify,
yielding

$$u(x, t) = \cos(\omega t + \delta\omega t - kx - \delta k x) + \cos(\omega t - \delta\omega t - kx + \delta k x) = 2\cos(\omega t - kx)\cos(\delta\omega t - \delta k x). \tag{3}$$

Thus the sum of the two harmonic waves is a product of two
cosine functions (Fig. 2.8-1). By their arguments, both correspond
to propagating harmonic waves. Because $\delta\omega$ is less than
$\omega$, the second term has a lower frequency, and so varies more
slowly with time than the first. Similarly, because $\delta k$ is less than
$k$, the second term varies more slowly in space. Thus we have a
carrier wave with angular frequency $\omega$ and wavenumber $k$, on
which a slower varying envelope with angular frequency $\delta\omega$
and wavenumber $\delta k$ is superimposed.$^1$

Examination of when the phase of each term remains constant
shows that each describes waves traveling at a different
speed. The envelope, or beat pattern, propagates at the group
velocity

$$U = \delta\omega/\delta k, \tag{4}$$

whereas the carrier moves at the phase velocity,

$$c = \omega/k. \tag{5}$$

The difference between these two velocities is illustrated by 
Fig. 2.8-1. Comparison of the signal at different times shows 
that the envelope propagates at a different speed from the car¬ 
rier. This difference explains why in the surface wave data of 
Fig. 2.7-4 individual lines had a slope (phase velocity) differing 
from the slope (group velocity) of the overall wave pattern. 
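
The carrier and envelope decomposition can be checked numerically. The sketch below uses arbitrary illustrative values for $\omega$, $k$, $\delta\omega$, and $\delta k$ (they are not taken from Fig. 2.8-1), confirms that the sum of the two waves equals the product form of Eqn 3, and evaluates the phase and group velocities of Eqns 4 and 5.

```python
import math

# Illustrative values only: mean angular frequency and wavenumber, and small offsets.
omega, k = 2.0 * math.pi * 1.0, 2.0 * math.pi / 4.0
d_omega, d_k = 0.05 * omega, 0.06 * k

def two_waves(x, t):
    """Eqn 1: sum of two harmonic waves with slightly different omega and k."""
    return (math.cos((omega + d_omega) * t - (k + d_k) * x)
            + math.cos((omega - d_omega) * t - (k - d_k) * x))

def carrier_times_envelope(x, t):
    """Eqn 3: carrier cos(omega t - k x) modulated by envelope cos(d_omega t - d_k x)."""
    return 2.0 * math.cos(omega * t - k * x) * math.cos(d_omega * t - d_k * x)

phase_velocity = omega / k          # Eqn 5: speed of the carrier
group_velocity = d_omega / d_k      # Eqn 4: speed of the envelope

if __name__ == "__main__":
    worst = max(abs(two_waves(x, t) - carrier_times_envelope(x, t))
                for x in range(0, 50, 7) for t in range(0, 50, 3))
    print("largest difference between the two forms:", worst)
    print("c =", phase_velocity, " U =", group_velocity)
```

Because $\delta\omega/\delta k$ differs from $\omega/k$ for these values, the envelope moves more slowly than the carrier, which is the behavior sketched in Fig. 2.8-1.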




Fig. 2.8-1 Two sinusoidal waves with slightly different frequencies and
wavenumbers (a). Their sum as a function of time (b) yields a beating
pattern, or long-period envelope, which propagates at the group velocity,
$U$. The carrier, the high-frequency oscillation whose amplitude is
modulated by the envelope, propagates at the phase velocity, $c$.


2.8.2 Dispersive signals 

Because dispersive waves of different frequencies propagate at 
different speeds, this process is best viewed by using Fourier 
analysis to decompose a wave into the frequencies that 
compose it. Hence, although we discuss Fourier analysis in 
Chapter 6, we introduce some key concepts here without 
proof. For a function of time $f(t)$, multiplication by the complex
exponential $e^{-i\omega t}$ and integration over all time yields a
function of angular frequency $\omega$:

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt, \tag{6}$$

known as the Fourier transform of $f(t)$. Because the integral
involves a complex exponential, $F(\omega)$ is generally a complex
function. Similarly, $f(t)$ and $F(\omega)$ are related by the inverse
Fourier transform:

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega. \tag{7}$$

Thus the time function f(t) can be written as an integral over 
angular frequency of the complex exponentials e mt , weighted 
by the value of the transform at that angular frequency, F(co). 
Because the Fourier transform is complex, it can be written

$$F(\omega) = A(\omega)\, e^{i\phi(\omega)}$$    (8)

in terms of its magnitude, $A(\omega) = |F(\omega)|$, and phase, $\phi(\omega)$.
Thus the Fourier transform represents a time series by two real
functions of angular frequency: the amplitude spectrum, $A(\omega)$,
and the phase spectrum, $\phi(\omega)$.

1 This derivation also describes the amplitude modulation (AM) transmission
method used in radio, where the amplitude of the carrier is changed or modulated by
the envelope, the signal of interest.
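
The split of a transform into amplitude and phase spectra (Eqn 8) can be illustrated with a discrete stand-in for Eqn 6. This is a sketch only: it uses a direct discrete Fourier sum rather than the continuous integral, and the test signal (its length, frequency, and initial phase) is an assumed example.

```python
import cmath
import math

def dft(samples):
    """Discrete analogue of Eqn 6: F_k = sum_n f_n exp(-2*pi*i*k*n/N)."""
    n_samples = len(samples)
    return [
        sum(f_n * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, f_n in enumerate(samples))
        for k in range(n_samples)
    ]

def amplitude_and_phase(spectrum):
    """Split a complex spectrum into A = |F| and phi = arg(F), as in Eqn 8."""
    amplitude = [abs(value) for value in spectrum]
    phase = [cmath.phase(value) for value in spectrum]
    return amplitude, phase

# Assumed test signal: 4 cycles of a cosine with an initial phase of 0.5 rad.
N = 32
INITIAL_PHASE = 0.5
signal = [math.cos(2 * math.pi * 4 * n / N + INITIAL_PHASE) for n in range(N)]
amplitude, phase = amplitude_and_phase(dft(signal))
```

For this signal the amplitude spectrum peaks at the bin matching the cosine's frequency, and the phase spectrum at that bin returns the initial phase.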

The inverse Fourier transform lets us express a displacement
field $u(x, t)$ as an integral over harmonic plane waves of all
frequencies

$$u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} A(\omega) \exp i[\omega t - k(\omega)x + \phi_i(\omega)]\, d\omega.$$    (9)


In this formulation, the wavenumber $k(\omega)$ and the amplitude
$A(\omega)$ of each harmonic plane wave are functions of the angular
frequency. At each angular frequency, the phase

$$\Phi(\omega) = \omega t - k(\omega)x + \phi_i(\omega)$$    (10)

has two parts. The term $\omega t - k(\omega)x$ gives the variation in the
phase due to the propagation of the harmonic wave. Hence,
as shown in Fig. 2.2-3, the propagation depends on both time
($\omega t$) and space ($k(\omega)x$). Surfaces of constant phase travel with a
phase velocity

$$c(\omega) = \omega / k(\omega)$$    (11)

that may vary as a function of angular frequency. The other
phase term, $\phi_i(\omega)$, includes effects such as the initial phase of
the wave when it was generated by a seismic source, which
depends on the earthquake focal mechanism.

If the harmonic waves of different angular frequencies making
up the displacement (Eqn 9) propagate with different phase
velocities, the velocity at which a wave group propagates
differs from the phase velocity at which individual harmonic
waves travel. To find the group velocity of energy propagation
in the angular frequency band between $\omega_0 - \Delta\omega$ and
$\omega_0 + \Delta\omega$, we first approximate the wavenumber $k(\omega)$ by the
first two terms of a Taylor series about $\omega_0$,

$$k(\omega) \approx k(\omega_0) + \left.\frac{dk}{d\omega}\right|_{\omega_0} (\omega - \omega_0).$$    (12)


Substituting Eqn 12 in the inverse Fourier transform (Eqn 9)
shows that the displacement due to harmonic waves with angular
frequencies near $\omega_0$ can be approximated by

$$u(x, t) \approx \frac{1}{2\pi} \int_{\omega_0 - \Delta\omega}^{\omega_0 + \Delta\omega} A(\omega) \exp i\left[\omega t - k(\omega_0)x - \left.\frac{dk}{d\omega}\right|_{\omega_0} (\omega - \omega_0)x + \phi_i(\omega)\right] d\omega.$$    (13)

Adding and subtracting $\omega_0 t$ and regrouping gives

$$u(x, t) \approx \frac{1}{2\pi} \int_{\omega_0 - \Delta\omega}^{\omega_0 + \Delta\omega} A(\omega) \exp i\left[(\omega - \omega_0)\left(t - \left.\frac{dk}{d\omega}\right|_{\omega_0} x\right) + (\omega_0 t - k(\omega_0)x) + \phi_i(\omega)\right] d\omega.$$    (14)

The argument of the exponential has three terms, the first
two of which describe traveling waves. The second term,
$(\omega_0 t - k(\omega_0)x)$, describes a wave with average angular frequency
$\omega_0$ propagating at the phase velocity $c(\omega_0) = \omega_0/k(\omega_0)$. By
contrast, the first term describes a wave group with average
angular frequency $\omega_0$ propagating at a group velocity $U(\omega_0)$ given
by the condition that

$$t - \left.\frac{dk}{d\omega}\right|_{\omega_0} x$$    (15)

remain constant, so

$$U(\omega_0) = 1 \Big/ \left.\frac{dk}{d\omega}\right|_{\omega_0} = \left.\frac{d\omega}{dk}\right|_{\omega_0}.$$    (16)

If the signal has energy over a wide range of angular
frequencies, similar expansions for each angular frequency band give
the group velocity as a function of angular frequency

$$U(\omega) = \frac{d\omega}{dk}.$$    (17)

Although the group velocity can always be defined by Eqn 17,
it does not always yield the velocity of energy propagation as
a function of angular frequency. For example, if the wavenumber
is a very rapidly varying function of angular frequency,
then using only the first two terms in the Taylor series (Eqn 12)
may not be adequate, and Eqn 17 may yield negative group
velocities. In this case, the group velocity is no longer a useful
concept. Fortunately, these approximations are generally valid
for seismic surface waves.

At any angular frequency, the group velocity is related to the
phase velocity by

$$U = \frac{d\omega}{dk} = \frac{d(ck)}{dk} = c + k\frac{dc}{dk}.$$    (18)

It is sometimes easier to think in terms of wavelength, restating
Eqn 18 as

$$U = c - \lambda \frac{dc}{d\lambda}.$$    (19)

If a wave is not dispersive, different wavelengths travel at the
same phase velocity, so $dc/d\lambda = 0$, and the phase and group
velocities are equal.

For a dispersive wave, such as the Love wave in the previous
section, the group velocity can be found from the dispersion
relation. If the dispersion relation is

$$f(\omega, k) = 0,$$    (20)

then the change in $f$ for a small change in $\omega$ and $k$ is given by the
Taylor series,

$$f(\omega + d\omega, k + dk) = f(\omega, k) + \left(\frac{\partial f}{\partial \omega}\right)_k d\omega + \left(\frac{\partial f}{\partial k}\right)_\omega dk.$$    (21)

Because $\omega$ and $k$ define a mode, they satisfy the dispersion
relation, $f(\omega, k) = 0$. If $\omega + d\omega$, $k + dk$, is also a solution, then
$f(\omega + d\omega, k + dk)$ must also be zero, so the group velocity is given by

$$U = \frac{d\omega}{dk} = -\left(\frac{\partial f}{\partial k}\right)_\omega \Big/ \left(\frac{\partial f}{\partial \omega}\right)_k.$$    (22)
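
Eqn 22 lends itself to a numerical check. The sketch below assumes, purely for illustration, the dispersion relation $f(\omega, k) = \omega^2 - gk$ (gravity waves on deep water, which is not the Love wave relation of the previous section) and evaluates the partial derivatives by central differences; the function names and step size are arbitrary choices.

```python
G = 9.81  # acceleration of gravity, m/s^2

def deep_water_relation(omega, k):
    """Assumed example dispersion relation f(omega, k) = omega^2 - g k."""
    return omega ** 2 - G * k

def group_velocity(relation, omega, k, step=1e-6):
    """Eqn 22: U = -(df/dk) / (df/domega), using central differences."""
    df_dk = (relation(omega, k + step) - relation(omega, k - step)) / (2 * step)
    df_domega = (relation(omega + step, k) - relation(omega - step, k)) / (2 * step)
    return -df_dk / df_domega

# A point on the dispersion curve, chosen so that f(omega, k) = 0.
omega0 = 0.3             # rad/s
k0 = omega0 ** 2 / G     # rad/m
phase_velocity = omega0 / k0
U0 = group_velocity(deep_water_relation, omega0, k0)
```

For this particular relation the group velocity comes out at half the phase velocity, a case in which the two velocities clearly differ.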


2.8.3 Surface wave dispersion studies 

It is useful to distinguish two types of dispersion. The familiar 
case is that of light, where the different frequencies travel 
through material such as a lens or a prism at different speeds. 
This phenomenon, known as physical dispersion , occurs in the 
earth but is a small effect (Section 3.7). In seismology, a more 
significant effect is that shown for Love waves in the previous 
section, where the apparent velocity along the surface varied 
with frequency although the intrinsic shear wave velocity in the 
layer and the halfspace did not. This type of dispersion, called 
geometrical dispersion , is noticeable and is frequently studied 
for surface waves. Because for surface waves the horizontal
apparent velocity, $c_x$, and wavenumber, $k_x$, vary with frequency,
these are sometimes written simply as $c$ and $k$. Similarly, we
usually speak of “phase velocity” or “group velocity” when we 
mean horizontal apparent phase or group velocity. 

Figure 2.8-2 illustrates phase and group velocity curves for 
the fundamental mode Love wave in the layer over a halfspace 
geometry of the previous section. Although the phase velocity 
increases monotonically with period, as longer period waves 
“feel” the halfspace velocity, the group velocity curve has a 
minimum. This minimum occurs at a period (about 15 s) where 
the slope of the phase velocity curve becomes very steep. This is 
because, by Eqn 19, $U$ decreases when the dispersion term
$dc/d\lambda$ becomes large.

The fact that the surface wave velocities vary depending on
the depth range sampled by each period makes surface wave
dispersion valuable for studying earth structure. These studies
are conducted both with Love waves, whose dispersion
depends on the shear velocity, and Rayleigh waves, whose
dispersion depends on both the compressional and the shear
velocities.

Fig. 2.8-2 Fundamental mode Love wave phase and group velocities
for a model of the continental crust and mantle, a 40 km-thick layer with
$\beta_1 = 3.9$ km/s, $\rho_1 = 2.8$ g/cm$^3$ underlain by a halfspace with $\beta_2 = 4.6$ km/s,
$\rho_2 = 3.3$ g/cm$^3$. The group velocity has a minimum where the phase
velocity curve becomes steep, as longer-period waves sample more of
the velocity in the underlying halfspace.

Both phase and group velocity dispersion measurements are
used. Group velocities are easier to measure because they are
the velocities at which a wave group visible on a seismogram
travels. As shown by the Love waves in Fig. 2.8-3, the period
can be measured from the time between successive peaks or
troughs. Generally, the waves with longest periods travel
fastest, and therefore appear first on seismograms. The group
velocity is found by dividing the distance between the source and
the receiver by the travel time of the wave group. Hence the
wave group with a period of about 45 s arrived about 1145 s
after the earthquake, and thus has a group velocity of about
3.7 km/s (4200 km in 1145 s). The later-arriving wave group
with a period of about 35 s has a group velocity of about
3.6 km/s (4200 km in 1170 s). This method can be applied in
a more sophisticated way by using the Fourier transform of
a seismogram to isolate wave groups of different periods
(Fig. 2.8-4). When the original record (top) is filtered at a
succession of narrow frequency bands, energy is seen arriving at
different group velocities.
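
The arithmetic of this peak-and-trough measurement is short enough to sketch. The distance and travel times below are those quoted above; the individual peak times are hypothetical values chosen to give a 45 s period, and the function names are illustrative.

```python
def mean_period(extremum_times_s):
    """Estimate the period from the times of successive peaks (or successive troughs)."""
    gaps = [later - earlier
            for earlier, later in zip(extremum_times_s, extremum_times_s[1:])]
    return sum(gaps) / len(gaps)

def group_velocity(distance_km, travel_time_s):
    """Group velocity: source-receiver distance divided by wave group travel time."""
    return distance_km / travel_time_s

DISTANCE_KM = 4200.0

# Hypothetical peak times (s after the earthquake) around the 1145 s arrival.
peak_times = [1100.0, 1145.0, 1190.0]
period_s = mean_period(peak_times)

u_45s = group_velocity(DISTANCE_KM, 1145.0)  # wave group with period of about 45 s
u_35s = group_velocity(DISTANCE_KM, 1170.0)  # wave group with period of about 35 s
```

Repeating the calculation for each identifiable wave group gives one point per period on a group velocity dispersion curve.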

To use such data, the results are typically plotted as a
function of period and are compared to theoretical dispersion
curves for different structures. For example, the group
velocities for the seismogram in Fig. 2.8-3 are lower than predicted
for the simple structure in Fig. 2.8-2. A better fit to the data is
obtained for a model with lower layer and halfspace velocities.

This example illustrates a theme that we will encounter 
repeatedly: using seismological observations at the earth’s 
surface (in this case dispersion curves), to study the velocity at 
depth. As noted in Section 1.1.2, this is an inverse problem, in 
contrast to the forward problem of predicting the observations 
expected for a given velocity structure. Although solving the 
forward problem is straightforward, it can be more difficult to 
find a model or models consistent with the observations. For 
the moment, we assume that such a model can be found, if only 
by trial and error, and defer more detailed discussion until 
Chapter 7. 

Fig. 2.8-3 [Panel title: Love waves from California earthquake
recorded in New York.] Top: Love waves from an earthquake off the coast of
California, recorded on the transverse component at station RSNY in
New York, 4200 km away. Triangles indicate successive peaks and
troughs of the waveform. Bottom: Observed (dots) and predicted
(top line) group velocities for the reference structure in Fig. 2.8-2.
The data are better fit by the predicted velocities (lower line) from a
model with a 40 km-thick layer with shear velocity 3.6 km/s, overlying
a halfspace with velocity 4.4 km/s.

Dispersion data are used to study more complicated velocity
structures. Figure 2.8-5 shows the observed dispersion curves
and inferred S-wave velocity structure for a study of the Walvis
ridge, a linear elevated region in the South Atlantic. There are
noticeable group velocity differences between two paths from 
an earthquake on the Mid-Atlantic ridge, one along the Walvis 
ridge and one off the ridge. For periods greater than about 20 s 
the off-ridge path is faster, indicating the presence of higher- 
velocity upper mantle material to a depth of about 45 km. This 
difference may reflect the processes that formed the Walvis 
ridge, which is thought to have been generated by a hot spot 
(Section 5.2.4), a fixed source of magma beneath the Mid- 
Atlantic ridge. 

For periods less than about 50 s the group velocity increases 
with period, because the longer periods sample material whose 
velocity increases with depth. By contrast, for periods greater 
than about 50 s, the group velocity decreases with period. 
This decrease is interpreted as evidence for a low-velocity zone 
beneath the higher velocity “lid.” The surface wave data thus
provide evidence for the idea that the mechanically strong and
cold (hence higher-velocity) plates of the earth’s lithosphere
are underlain by a low-velocity zone (Section 3.5.3) where
temperatures approach the melting point of rock (Section
3.8.2).

Fig. 2.8-4 Love wave group velocity dispersion shown by a seismogram
from a Mongolian earthquake recorded in Japan (top). The data are
filtered around five successive periods. Longer period energy arrives
earlier, showing higher group velocity. [Axis label: Group velocity
(km/s).] (Kanamori and Abe, 1968.)

Earth structure is also studied using phase velocities. These 
are more difficult to measure than group velocities, because 
they are defined for harmonic waves of a single frequency. 
Taking the Fourier transform of a seismogram yields the phase
at each angular frequency, $\Phi(\omega)$. We assume that this phase,
on a seismogram recorded at a distance $x$ from an earthquake
at time $t$ after the earthquake, has three terms

$$\Phi(\omega) = [\omega t - k(\omega)x] + \phi_i(\omega) + 2n\pi = [\omega t - \omega x/c(\omega)] + \phi_i(\omega) + 2n\pi.$$    (23)

The $\omega t - k(\omega)x$ term is the phase due to the propagation of the
wave in time and space. The $\phi_i(\omega)$ term includes the initial
phase at the earthquake and any phase shift introduced by the
seismometer. The final term, $2n\pi$, reflects the periodicity of the
complex exponential, because adding an integral multiple of
$2\pi$ to the argument yields the same value.

Fig. 2.8-5 Rayleigh wave group velocity study of crust and upper mantle structure along the Walvis ridge. Left: Ray paths from an earthquake on the Mid-
Atlantic ridge. The path to station SDB is along the Walvis ridge, whereas the path to station WIN is similar, but off the ridge. Center: Dispersion curves
for the two paths. Right: Inferred shear wave velocity structure for the two paths, showing lower velocities along the ridge. (Data from Chave, 1979.)

The phase velocity can be found from observations in two
ways. One method uses seismograms recorded at two stations,
at distances $x_1$ and $x_2$ from an earthquake. If the waves arrive
at times $t_1$ and $t_2$, taking the Fourier transform at each station
gives the phase as a function of angular frequency:

$$\Phi_1(\omega) = \omega t_1 - \omega x_1/c(\omega) + \phi_i(\omega) + 2n\pi,$$

$$\Phi_2(\omega) = \omega t_2 - \omega x_2/c(\omega) + \phi_i(\omega) + 2m\pi.$$    (24)

We then form the difference $\Phi_{21} = \Phi_2 - \Phi_1$, and solve for the
phase velocity:

$$c(\omega) = \omega(x_2 - x_1) / [\omega(t_2 - t_1) + 2(m - n)\pi - \Phi_{21}(\omega)].$$    (25)

The initial phase is common to both stations, so the $\phi_i(\omega)$ term
drops out if the seismometers have the same response, and so
contribute the same phase shift. If the seismometers have
different responses, a correction term is added. The $2(m - n)\pi$
term is found empirically by ensuring that the phase velocity at
long periods is reasonable.
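
The role of the $2(m - n)\pi$ term in Eqn 25 can be seen in a small synthetic test. This is a sketch under assumed values: the station distances, arrival times, period, and the "true" phase velocity are all hypothetical, and the function names are illustrative. The phase difference is wrapped into $(-\pi, \pi]$, as a measured spectral phase would be, so the integer $(m - n)$ has to be chosen by the analyst.

```python
import math

def wrap_phase(phase):
    """Wrap a phase into (-pi, pi], as a measured spectral phase would be."""
    return math.atan2(math.sin(phase), math.cos(phase))

def candidate_velocities(omega, x1, x2, t1, t2, phase_diff, max_cycles=3):
    """Eqn 25 evaluated for each trial integer (m - n); returns {cycles: velocity}."""
    result = {}
    for cycles in range(-max_cycles, max_cycles + 1):
        denominator = omega * (t2 - t1) + 2.0 * math.pi * cycles - phase_diff
        if denominator > 0.0:  # skip unphysical (non-positive) velocities
            result[cycles] = omega * (x2 - x1) / denominator
    return result

# Hypothetical two-station geometry and a 40 s period wave.
C_TRUE = 4.0                      # km/s, assumed phase velocity
OMEGA = 2.0 * math.pi / 40.0      # rad/s
X1, X2 = 3000.0, 5000.0           # km
T1, T2 = 700.0, 1170.0            # s, start times of the analyzed windows

# Synthetic phase difference from Eqn 24 (the initial phase cancels), then wrapped.
true_diff = OMEGA * (T2 - T1) - OMEGA * (X2 - X1) / C_TRUE
measured_diff = wrap_phase(true_diff)

candidates = candidate_velocities(OMEGA, X1, X2, T1, T2, measured_diff)
```

In this example neighboring integers give velocities a few tenths of a km/s apart, which is why the choice is made by requiring a reasonable velocity at long periods.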

Alternatively, a single-station measurement of phase velocity
can be made by predicting the phase at the earthquake from its
focal mechanism (Section 4.3). If $\phi_i(\omega)$ is assumed to be known,
the phase velocity is

$$c(\omega) = \omega x / [\omega t + \phi_i(\omega) + 2n\pi - \Phi(\omega)].$$    (26)

Figure 2.8-6 shows an example of using phase velocity data 
to study the evolution of the oceanic lithosphere. Various
evidence shows that the oceanic lithosphere cools and thickens
as it moves away from the spreading ridge where it formed 
(Section 5.3.2). As a result, surface wave velocities depend 
on the age of the lithosphere. Thus the Rayleigh wave phase 
velocity for the two paths shown is slowest for the path to 
TUC, approximately parallel to the East Pacific rise, which 
includes primarily young lithosphere. The other path to ARE, 
which includes older lithosphere, shows higher velocities.
Similar effects are observed from group velocities.


Such studies yield an average dispersion curve, and hence 
average velocity along the great circle path traveled by the 
wave. However, the actual structure varies along the path. To 
study the evolution of the lithosphere, we would like to know 
the velocity of the lithosphere at each age. Unfortunately, the 
distribution of earthquakes and seismic stations is such that 
paths between earthquakes and seismic stations are rarely in 
lithosphere of a single age. Instead, we measure surface wave 
velocity on paths including different ages, as in Fig. 2.8-6. 

Determination of the variable velocity structure along a 
path is a complicated inverse problem. The simplest approach, 
known as the “pure path” method, divides the study area into 
regions, in this case regions formed during age intervals, in 
which the velocity at each angular frequency is assumed to 
be constant. We then take a set of paths between individual 
earthquakes and seismic stations, such that the $i$th path has
length $L_i$, and determine the phase or group velocity $v_i(\omega)$ for
each path as a function of angular frequency. The total time
required for the wave to travel the entire path is assumed to be
the sum of the times required to traverse each of the regions
along the path. Thus, if path $i$ contains segments of lengths $L_{ij}$
in each region $j$ with velocity $v_j(\omega)$,

$$L_i / v_i(\omega) = \sum_{j=1}^{n} L_{ij} / v_j(\omega).$$    (27)

We find the velocity in each region $v_j(\omega)$ by writing this as a
vector-matrix equation

$$\mathbf{d} = A\mathbf{m}$$    (28)

where the matrix $A_{ij} = L_{ij}$ and the data vector $d_i = L_i/v_i(\omega)$
are known, and the model vector $m_j = 1/v_j(\omega)$ is to be found.

Fig. 2.8-6 Application of Rayleigh wave phase velocity data to study
the evolution of the oceanic lithosphere. Top: Sample paths between 
earthquakes on the East Pacific Rise and seismic stations, which traverse 
lithosphere of various ages, as shown by the isochrons. The hatched 
regions are lithosphere younger than 3 million years. Bottom: Dispersion 
curves for the paths shown. The path to station TUC is through younger, 
hence lower-velocity, lithosphere than the path to ARE. (Data from 
Forsyth, 1975.) 


Typically, because the study area is divided into a number of
regions smaller than the number of paths, the number of
observations exceeds the number of model parameters sought. Hence
the data vector has more elements than the model vector, so the
matrix $A$ has more rows than columns and cannot be inverted.
Such overdetermined systems of equations are common in
seismology, especially in determining earth structure from
observations. As we will see in Chapter 7, the best solution in a
least squares sense to such systems of equations is found by
premultiplying both sides, first by the transpose matrix and
then by the inverse of $A^T A$,

$$\mathbf{m} = (A^T A)^{-1} A^T \mathbf{d}.$$    (29)
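
A small pure-path inversion following Eqns 27 to 29 can be sketched as follows. The path geometry and regional velocities are hypothetical, and the normal equations are solved with a short hand-written Gaussian elimination to keep the example self-contained; in practice a library least squares routine would normally be used instead.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def pure_path_velocities(lengths, travel_times):
    """Least squares solution of d = A m (Eqn 29), with A_ij = L_ij and m_j = 1/v_j."""
    n_paths, n_regions = len(lengths), len(lengths[0])
    ata = [[sum(lengths[i][j] * lengths[i][k] for i in range(n_paths))
            for k in range(n_regions)] for j in range(n_regions)]
    atd = [sum(lengths[i][j] * travel_times[i] for i in range(n_paths))
           for j in range(n_regions)]
    slowness = solve_linear(ata, atd)
    return [1.0 / s for s in slowness]

# Hypothetical example: four paths crossing two age regions (lengths in km).
L = [[1000.0, 0.0], [500.0, 1500.0], [0.0, 2000.0], [1200.0, 800.0]]
V_TRUE = [3.8, 4.1]  # km/s, assumed regional velocities at one frequency
d = [sum(L_ij / v_j for L_ij, v_j in zip(row, V_TRUE)) for row in L]  # d_i = L_i / v_i

v_est = pure_path_velocities(L, d)
```

With noise-free synthetic data the regional velocities are recovered exactly; with real data the same calculation is repeated at each angular frequency to build a dispersion curve per region.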

The results of such an analysis for Rayleigh wave phase 
velocity on many paths crossing the Pacific are shown in 
Fig. 2.8-7. As the lithosphere ages, the velocity and the depth 
to the low-velocity zone increase, presumably due to the 
cooling and thickening of the lithosphere. 

Such studies, on both a global and a regional scale, have
contributed greatly to our understanding of the earth’s interior
and processes. As we noted, finding velocity structure as
a function of depth from dispersion data is an inverse problem,
which exploits the fact that waves of different periods sample
the structure at depth differently. The pure-path study
illustrates a more complicated inverse problem, studying variations
of velocity laterally as well as in depth. Our ability to study
lateral structure comes from the fact that different source-
receiver paths sample different regions. Hence these studies
have the common feature of using observations on the
boundaries of a region (either laterally or at depth) to learn about the
structure within it, via observations resulting from sampling
the region in different ways. Such approaches are examples of
tomography, which we will discuss in Chapter 7.

2.8.4 Tsunami dispersion 

Dispersion is also observed for tsunamis, the water waves 
generated by earthquakes that were discussed in Section 1.2.4. 
Tsunamis are like wind-driven water waves, in that they involve 
gravitational potential energy stored by vertical displacements 
of the water. 2 Although the underlying physics of the
propagation differs, there are similarities in the way tsunamis and
surface waves propagate. 

As shown in Fig. 2.8-8 (left), tsunami dispersion is similar to 
that of Rayleigh and Love waves, in that the waves with longer 
periods travel faster and thus arrive earlier. The dispersion 
relations (Fig. 2.8-8, right ) show two effects that depend on the 
period, and thus on the wavelength. At long periods, where the 
wavelengths are much greater than the ocean depth, d, the phase 
velocities are essentially nondispersive and are given by 

$$c = \sqrt{g d},$$    (30)

where $g$ is the acceleration of gravity. Thus tsunami velocities
depend on ocean depth, as shown. However, at shorter
periods, where the wavelengths are much less than the ocean
depth and so do not “feel” the ocean floor, the tsunami
velocities depend on wavelength as

$$c = (\lambda g / 2\pi)^{1/2},$$    (31)

so shorter-period waves travel more slowly.

2 Although tsunamis are often called “tidal waves,” they have no connection to
tides.

Fig. 2.8-7 Left: Rayleigh wave phase velocity dispersion results for five age provinces in the Pacific basin. Right: Shear wave velocity structure derived
from the data. As the lithosphere ages, the phase velocity and depth to the low-velocity zone increase. (Nishimura and Forsyth, 1989.)

Fig. 2.8-8 Left: Tide gauge record of the tsunami at Hilo, Hawaii, from the great 1960 Chilean earthquake. Dispersion is seen, with the longer-period
waves arriving first. (After Eaton et al., 1961. © Seismological Society of America. All rights reserved.) Right: Theoretical tsunami dispersion curves for
group (U) and phase (C) velocities for different ocean depths. At longer periods the velocity is roughly constant and controlled by the ocean depth, whereas
at shorter periods, where the tsunami waves do not reach to the bottom, the velocities vary with period. (Ward, 1989.)
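
The two limiting tsunami velocities (Eqns 30 and 31) are easy to evaluate. The sketch below uses assumed example values for the ocean depth and wavelengths and takes $g = 9.81$ m/s$^2$; the function names are illustrative.

```python
import math

G = 9.81  # acceleration of gravity, m/s^2

def long_period_speed(depth_m):
    """Eqn 30: phase velocity when the wavelength greatly exceeds the ocean depth."""
    return math.sqrt(G * depth_m)

def short_period_speed(wavelength_m):
    """Eqn 31: phase velocity when the wavelength is much less than the ocean depth."""
    return math.sqrt(wavelength_m * G / (2.0 * math.pi))

# Assumed example: a 4000 m deep ocean.
c_long = long_period_speed(4000.0)        # m/s
c_long_kmh = c_long * 3.6                 # km/h

# Assumed short wavelengths, both much less than the 4000 m depth.
c_short_400 = short_period_speed(400.0)   # m/s
c_short_100 = short_period_speed(100.0)   # m/s
```

For this assumed depth the long-period speed is close to 200 m/s, roughly that of a jet aircraft, while the short-wavelength speeds are far lower and decrease with decreasing wavelength.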

Like surface waves, tsunamis travel across the earth’s surface,
so their amplitudes decay roughly according to $1/\sqrt{r}$ due
to two-dimensional spreading. However, applying Snell’s law
to their horizontal propagation shows that the paths of surface 
waves and tsunamis deviate from the shortest great circle path 
if there are large lateral velocity variations. This effect, called 
multipathing because waves arrive at a receiver from several 


directions, can cause large changes in the waves’ amplitudes
due to the effects of focusing and defocusing (Section 3.7.3). As
a result, the amplitude variations can be inferred from the
concentration of ray paths that left the source uniformly spaced.
Denser paths show rays focusing and increasing amplitudes,
whereas sparser paths indicate defocusing and lower
amplitudes. Figure 2.8-9 shows focusing and defocusing for the
tsunami in Fig. 2.8-8 (left), due to variations in ocean depth.
We will also use this method to study body wave amplitudes in
Chapter 3.

Fig. 2.8-9 Ray paths for the tsunami in Fig. 2.8-8 (left). Tick marks show
the travel times in increments of hours. Variations in ocean depth, and
therefore in tsunami velocities, cause multipathing that results in large
variations in amplitudes. (Woods and Okal, 1987. Geophys. Res. Lett.,
14, 765-8, copyright by the American Geophysical Union.)


2.9 Normal modes of the earth 

2.9.1 Motivation 

We started this chapter (Section 2.2) by considering the motion 
of a string that resulted from applying a force, and saw that the 
displacement could be viewed in two ways: either as waves 
propagating along the string or as the sum of standing waves, 
called normal modes. Both of these descriptions came from 
applying Newton’s second law of motion, and are equivalent 
because all the features of wave propagation, such as the
velocities and amplitudes of the reflected and transmitted waves,
come out the same. This concept, called mode-wave duality , 
is useful in seismology because the two formulations provide 
different insights and jointly lead to deeper understanding. 
Neither formulation is more “real” — both are mathematical 
ways of representing the displacement, which is the physical 
quantity. 

In a similar way, we end this chapter by extending the duality 
to the three-dimensional earth. We discuss how all body and 
surface waves can be described as the sums of the normal 
modes, also called free oscillations, of the spherical earth. 
These sums yield not only the reflections and transmissions 
from all boundaries, but also waves produced by effects like 
diffraction that are difficult to model because geometric optics 
fails (Section 2.5.10). However, when we discuss seismological 
investigations of earth structure in Chapter 3, it will turn out
that most studies do not use a normal mode approach, for two
reasons. First, normal mode calculations are more complicated
than those for rays and plane waves. Second, by representing
all seismic waves simultaneously, mode solutions do not select
specific seismic phases. Hence a phase like ScS emerges from
a computation summing many modes, whereas simpler ray or
plane wave calculations often directly give the information (for
example, travel times and amplitudes) that we seek. However,
there are applications in which modal solutions are useful,
making the topic worthy of study for reasons beyond its
physical elegance, although the latter may well be what draws many
seismologists (ourselves included) to it.

2.9.2 Modes of a sphere

The earth’s modes show many features seen for the
one-dimensional string, so we begin by recalling some basic results.
We saw in Section 2.2.5 that once a one-dimensional string is
excited, its motion can be described as

$$u(x, t) = \sum_{n=0}^{\infty} A_n U_n(x, \omega_n) \cos(\omega_n t),$$    (1)

which is the sum of standing waves or eigenfunctions,
$U_n(x, \omega_n)$, each of which is weighted by the amplitude $A_n$
and vibrates at its eigenfrequency $\omega_n$. The eigenfunctions and
eigenfrequencies depend on the physical properties of the 
string, whereas the amplitudes depend on the position and 
nature of the source that excited the motion. We saw that 
eigenfunctions that satisfy the wave equation in one dimension 
are sine and cosine functions. For a homogeneous (uniform) 
string of length L and velocity v, the boundary conditions of 
zero displacement at the fixed ends require that 

$$U_n(x, \omega_n) = \sin(n\pi x/L) = \sin(\omega_n x/v),$$    (2)

so the eigenfrequencies are

$$\omega_n = n\pi v/L.$$    (3)

Because the frequency, velocity, and wavelength of a traveling
wave are related by $\omega = 2\pi v/\lambda$ (Section 2.2.2), Eqn 3 requires
that $L = n\lambda/2$, so each spatial eigenfunction has an integral
number of half wavelengths along the string. A finite string can
vibrate only in these discrete modes, which satisfy the
boundary conditions. The eigenfrequencies are spaced $\pi v/L$ apart in
frequency, so if the string were infinite, the eigenfrequencies
would be continuous rather than discrete. Finally, we saw that
the amplitudes depend on the value of the eigenfunction at the
point where the source excited the motion. 1

1 Representing the displacement as a sum of sines and cosines, where the
eigenfunctions have discrete eigenfrequencies, corresponds to a Fourier series, whereas a
continuous distribution of eigenfrequencies corresponds to a Fourier transform. We
use both concepts informally as needed throughout the text, and develop them more
formally in Chapter 6.
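
The string results recalled in Eqns 1 to 3 can be put into a few lines of code. This is a sketch under assumed values: the string length, velocity, and mode amplitudes are hypothetical, and the function names are illustrative.

```python
import math

def eigenfrequency(n, length, velocity):
    """Eqn 3: omega_n = n * pi * v / L."""
    return n * math.pi * velocity / length

def eigenfunction(n, x, length):
    """Eqn 2: U_n(x) = sin(n * pi * x / L), zero at both fixed ends."""
    return math.sin(n * math.pi * x / length)

def displacement(x, t, amplitudes, length, velocity):
    """Eqn 1 truncated to len(amplitudes) modes; amplitudes[n] weights mode n."""
    return sum(
        a_n * eigenfunction(n, x, length) * math.cos(eigenfrequency(n, length, velocity) * t)
        for n, a_n in enumerate(amplitudes)
    )

# Hypothetical string: length 1 m, wave velocity 3 m/s, three excited modes.
LENGTH, VELOCITY = 1.0, 3.0
AMPLITUDES = [0.0, 1.0, 0.5, 0.25]  # the n = 0 term contributes nothing

omega_2 = eigenfrequency(2, LENGTH, VELOCITY)
wavelength_2 = 2.0 * math.pi * VELOCITY / omega_2  # from omega = 2*pi*v/lambda
u_end = displacement(LENGTH, 0.7, AMPLITUDES, LENGTH, VELOCITY)
```

The check that $L = n\lambda/2$ and that the displacement vanishes at the fixed ends follows directly from the boundary conditions described above.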







Additional insight into the earth’s modes comes from the
two-dimensional problem of Love waves in a layer over a
halfspace (Section 2.7.3). The medium was semi-infinite,
extending vertically from the surface to all depths, and
horizontally in both directions. We wrote a solution of the wave
equation in both the layer and the halfspace as the product of
separate terms describing the vertical and horizontal behaviors.
We then used boundary conditions of zero traction at the free
surface, continuity of traction and displacement at the
interface, and energy decaying away from the interface downward,
and found that these conditions require that Love waves have
discrete eigenfrequencies that depend on the thickness of the
layer and the shear velocity of the layer and the halfspace. Each
of these eigenfrequencies thus corresponds to a vertical and
horizontal eigenfunction. Interestingly, the eigenfrequencies
form discrete overtone branches (Fig. 2.7-9), so that for a given
apparent velocity there are several possible eigenfrequencies.
Because the medium is two-dimensional, we need two
parameters to list all the eigenfrequencies. One parameter, the
overtone number, varies discretely (0, 1, 2, ... ) because the
thickness of the layer gives a discrete dimension. The other
parameter, the frequency, varies continuously along an overtone
branch, because the horizontal dimension is infinite.

To extend one- and two-dimensional ideas to wave propagation
in the three-dimensional spherical earth, we formulate the
normal mode solution in spherical coordinates (Section A.7).
Because waves propagate away from the seismic source, we put
the pole of the coordinate system there (Fig. 2.9-1). We then
write the displacement vector $\mathbf{u}(r, \theta, \phi) = (u_r, u_\theta, u_\phi)$ that
satisfies the equation of motion (Eqn 2.4.10) as a function of radius
$r$ and surface position $(\theta, \phi)$. A slight linguistic complication is
that in spherical coordinates the radial direction is the vertical,
whereas for plane waves the term “radial” (Fig. 2.7-2) denotes
the horizontal direction in the vertical plane containing the
source and the receiver. In this spherical geometry, $u_\theta$ is in the
direction analogous to that of plane wave propagation, and $u_\phi$
is transverse to it.

By analogy to the string (Eqn 1), we write the displacement
as a normal mode sum

$$\mathbf{u}(r, \theta, \phi) = \sum_n \sum_l \sum_m {}_nA_l^m \, {}_ny_l(r) \, \mathbf{x}_l^m(\theta, \phi) \, e^{i\,{}_n\omega_l^m t}.$$    (4)

Because the medium is three-dimensional, each mode is
described by its radial (depth) order $n$, and two surface orders $l$
and $m$. All three indices have discrete integer values, because the
earth is a finite body. The eigenfrequency depends on all three,
and the spatial behavior is described by a radial (or vertical)
eigenfunction ${}_ny_l(r)$, which is a scalar, and a surface
eigenfunction $\mathbf{x}_l^m(\theta, \phi)$, which is a vector. The sum depends on the
weights for each eigenfunction, ${}_nA_l^m$, which are excitation
amplitudes that depend on the seismic source. Thus a mode’s
displacement varies along the earth’s surface depending on
both the excitation of that mode and the location relative to
the source, which combine to control the value of the surface
eigenfunction. As with modes on a string, we can think of the
displacement as a vector in a vector space (Section A.3.6)
whose basis vectors are the eigenfunctions, which are weighted
and combined to describe the displacement.

Fig. 2.9-1 Spherical coordinate geometry for normal modes. The
earthquake source is at the pole, so at a receiver the radial displacement
component $u_r$ is vertical, $u_\theta$ is in the horizontal direction in the vertical
plane containing the source and the receiver, and $u_\phi$ is in the transverse
direction.

Although Eqn 4 seems abstract, it turns out to be useful. If
we take the Fourier transform of a long seismogram, which
might extend for days or even weeks following a great
earthquake, we find that the amplitude spectrum 2 (Eqn 2.8.8) is made
up of normal modes that appear as peaks at certain distinct
frequencies (Fig. 2.9-2). Hence thinking about a seismogram as
a sum of modes gives additional insight into its nature.

Separating the radial and surface eigenfunctions in the
normal mode sum (Eqn 4) has interesting consequences. The earth
is close to being spherically symmetric (sometimes termed
laterally homogeneous), because its structure varies much
more with depth than it does laterally at a given depth. By
analogy to Love waves, we expect the surface eigenfunction to be
an analytic form related to the wave equation. Moreover, if
the earth were laterally homogeneous (as assumed in our Love
wave example), the surface eigenfunction would not affect the
eigenfrequency. Thus, for a laterally homogeneous earth, we
can write the eigenfrequencies as ${}_n\omega_l^m = {}_n\omega_l$. We will see later
that this useful approximation also assumes that the earth is
perfectly spherical and not rotating.

The eigenfrequency depends on the radial eigenfunction, 
which is found by solving the equation of motion in the spher¬ 
ical earth subject to boundary conditions at different depths. 
Although the boundary conditions (continuity of stress and 
tractions) do not sound unduly formidable, they turn out to be 
complicated because the tractions involve stresses and hence 

2 As discussed in Section 6.2, the amplitude spectrum is the magnitude of the Fourier 
transform, and its square shows how much energy is present at different frequencies.

Fig. 2.9-2 Amplitude spectrum of the
radial component of a 35-hour seismogram
following the great June 9, 1994, deep focus
Bolivia earthquake, recorded at Pasadena, 
California. Many peaks are labeled with 
several modes, indicating coupling between 
modes of similar frequencies. The solid line 
is the observed spectrum, and the dashed line 
is the spectrum predicted by a three- 
dimensional earth velocity model. (Dahlen 
and Tromp, 1998. Copyright © by Princeton 
University Press. Reprinted by permission of 
Princeton University Press.)

the gradients of displacements. As noted in Section A.7.4, gradients
in spherical coordinates require taking the derivatives of
the unit basis vectors that vary with position, unlike those in 
Cartesian coordinates that always point the same way. Thus 
we leave the problem of finding the radial eigenfunctions, and 
hence the eigenfrequencies, for advanced texts, just as we did 
for a string and for surface waves. As a result, we will also not 
address the issue of computing the excitation, which depends 
on the radial eigenfunctions at the source depth. 


2.9.3 Spherical harmonics 

The surface eigenfunctions are based on spherical harmonics,
functions often used to expand a function on the surface
of a sphere, much as sines and cosines are used in Cartesian
coordinates. Because we use the seismic source as the pole,
$\theta$ is the angular distance from the pole, or colatitude, and $\phi$
is the azimuth around the pole, or longitude (Fig. 2.9-1).

The angular variations are described by a set of functions
called Legendre polynomials, which are indexed by the degree,
or angular order, $l$,

$$P_l(x) = \frac{1}{2^l\, l!} \frac{d^l}{dx^l} (x^2 - 1)^l. \qquad (5)$$

The first several polynomials are

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}(3x^2 - 1), \quad P_3(x) = \tfrac{1}{2}(5x^3 - 3x), \qquad (6)$$

and some examples are shown in Fig. 2.9-3. For a sphere,
$x = \cos\theta$, so $x$ ranges over $-1 \le x \le 1$. Legendre polynomials
are orthogonal over this interval, and so are a suitable basis set
for describing the angular variations.

Fig. 2.9-3 Examples of Legendre polynomials for the interval 0 to $\pi$, used to
describe the displacements associated with normal mode oscillations.
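Eqns 5 and 6 and the orthogonality property can be checked with exact arithmetic. This is a minimal sketch, assuming polynomials are stored as coefficient lists (lowest power first); the helper names `poly_mul`, `poly_diff`, `legendre`, and `inner` are illustrative and not from the text.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre(l):
    """Coefficients of P_l(x) from the derivative formula of Eqn 5."""
    p = [Fraction(1)]
    for _ in range(l):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # times (x^2 - 1)
    for _ in range(l):
        p = poly_diff(p)
    scale = Fraction(1, 2**l * factorial(l))
    return [scale * c for c in p]

def inner(a, b):
    """Exact integral of a(x) b(x) over -1 <= x <= 1."""
    prod = poly_mul(a, b)
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0)

for l in range(4):
    print(l, [str(c) for c in legendre(l)])
print("P2.P3 =", inner(legendre(2), legendre(3)))
print("P3.P3 =", inner(legendre(3), legendre(3)))
```

The coefficient lists for $l = 0$ to 3 should reproduce Eqn 6, and the integral of the product of two different polynomials vanishes. The nonzero value for equal degrees, $2/(2l + 1)$, is the standard normalization and is not stated in the text above.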

The azimuthal variations are included by forming the associated
Legendre functions,

$$P_l^m(x) = \frac{(1 - x^2)^{m/2}}{2^l\, l!} \frac{d^{l+m}}{dx^{l+m}} (x^2 - 1)^l, \qquad (7)$$


where the azimuthal order, $m$, varies over $-l \le m \le l$. The
azimuthal functions $e^{im\phi}$ and associated Legendre functions
are combined to give the fully normalized spherical harmonics,

$$Y_l^m(\theta, \phi) = (-1)^m \left[ \frac{2l + 1}{4\pi} \frac{(l - m)!}{(l + m)!} \right]^{1/2} P_l^m(\cos\theta)\, e^{im\phi}. \qquad (8)$$

Spherical harmonics are always defined with the $P_l^m(\cos\theta)\, e^{im\phi}$
term, but various normalizing factors are used in the literature.

The angular variations from 0 to $\pi$ are either symmetric
(when $l + m$ is even) or antisymmetric (when $l + m$ is odd) about
the equator ($\theta = \pi/2$). The azimuthal variations are periodic
($\phi + 2\pi$ is equivalent to $\phi$). Because spherical harmonics are generally complex
functions, we can plot their real or imaginary parts over
the sphere (Fig. 2.9-4). The angular order, $l$, gives the number
of nodal lines on the surface. If the azimuthal order $m$ is zero,
the nodal lines are small circles about the pole. These are
called zonal harmonics, and do not depend on $\phi$ (i.e., they are
symmetric about the pole at $\theta = 0$). The other extreme is for
$m = l$, where all the surface nodal lines are great circles
through the pole. These are called sectoral harmonics. When
$0 < |m| < l$, there are combined angular and azimuthal
(colatitudinal and longitudinal) nodal patterns called tesseral
harmonics (Fig. 2.9-4).

Spherical harmonics are orthogonal,

$$\int_0^{2\pi} \int_0^{\pi} \sin\theta\; Y_{l'}^{m'*}(\theta, \phi)\, Y_l^m(\theta, \phi)\, d\theta\, d\phi = \delta_{l'l}\, \delta_{m'm}, \qquad (9)$$

so that the integral of the product of one with the conjugate
of another over the sphere is zero unless the two are the same. 3 The spherical harmonics
therefore form an orthogonal set of basis vectors that can be
used to expand any function on the surface of a sphere, much
as we used sines for the string (and would do so for any
other Cartesian coordinate problem). Spherical harmonics
are used to represent planetary quantities, including lateral
variations in seismic velocity, surface topography, and gravitational
and magnetic fields. The shape of the field represented
depends on the amplitudes of the different spherical harmonic
components.

3 As defined in Eqn A.3.37, $\delta_{nm} = 0$ unless $n = m$.

Fig. 2.9-4 Examples of spherical harmonics: a zonal harmonic (left),
the real part of a sectoral harmonic (middle), and the real part of a
tesseral harmonic (right). (After Lapwood and Usami, 1981, reprinted
with permission of Cambridge University Press.)


2.9.4 Torsional modes

Using spherical harmonics, we can write the normal modes of
a sphere (Eqn 4) explicitly. You may recall that in Cartesian
coordinates we separated the displacements into P-SV and SH
motions, which are decoupled in the sense that they propagate
independently in a medium whose properties vary only in
depth along the plane containing the source and the receiver
(Section 2.5.2). In spherical geometry, we do a similar decomposition
with normal modes.

Analogous to SH waves, we have torsional, or toroidal,
modes. Their surface eigenfunctions are given by the vector
spherical harmonics with $(r, \theta, \phi)$ components

$$\mathbf{T}_l^m = \left( 0,\; \frac{1}{\sin\theta} \frac{\partial Y_l^m(\theta, \phi)}{\partial\phi},\; -\frac{\partial Y_l^m(\theta, \phi)}{\partial\theta} \right). \qquad (10)$$

The vector spherical harmonics are vectors whose components
contain derivatives of spherical harmonics, which arise because
the equation of motion involves spatial derivatives of the
displacements.
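The scalar spherical harmonics that enter these vector harmonics can be evaluated directly from Eqns 7 and 8, and their orthonormality (Eqn 9) checked numerically. The following is a minimal sketch rather than an optimized implementation: the function names `assoc_legendre`, `sph_harm`, and `overlap` are illustrative, the derivative in Eqn 7 is taken on polynomial coefficients, and the surface integral uses a simple midpoint rule whose grid sizes are arbitrary choices.

```python
import cmath
import math

def assoc_legendre(l, m, x):
    """P_l^m(x) as in Eqn 7; the (-1)^m factor is carried by Eqn 8, not here."""
    coeffs = [0.0] * (2 * l + 1)           # coefficients of (x^2 - 1)^l
    for k in range(l + 1):
        coeffs[2 * k] = math.comb(l, k) * (-1.0) ** (l - k)
    for _ in range(l + m):                 # differentiate l + m times
        coeffs = [k * coeffs[k] for k in range(1, len(coeffs))] or [0.0]
    deriv = sum(c * x**k for k, c in enumerate(coeffs))
    return (1.0 - x * x) ** (m / 2.0) * deriv / (2**l * math.factorial(l))

def sph_harm(l, m, theta, phi):
    """Fully normalized Y_l^m(theta, phi) as in Eqn 8."""
    norm = math.sqrt((2 * l + 1) / (4.0 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return ((-1) ** m * norm * assoc_legendre(l, m, math.cos(theta))
            * cmath.exp(1j * m * phi))

def overlap(l1, m1, l2, m2, n_theta=200, n_phi=16):
    """Midpoint-rule estimate of the surface integral in Eqn 9."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += (math.sin(theta)
                      * sph_harm(l1, m1, theta, phi).conjugate()
                      * sph_harm(l2, m2, theta, phi))
    return total * d_theta * d_phi

print(abs(overlap(2, 1, 2, 1)))   # same harmonic: close to 1
print(abs(overlap(2, 1, 3, 1)))   # different l: close to 0
print(abs(overlap(2, 0, 2, 1)))   # different m: close to 0
```

Library routines for associated Legendre functions often fold the $(-1)^m$ factor into $P_l^m$ itself, so signs for odd $m$ should be compared with care against Eqn 8.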

The displacement vector $\mathbf{u} = (u_r, u_\theta, u_\phi)$ that corresponds to
torsional modes is

$$\mathbf{u}^T(r, \theta, \phi, t) = \sum_n \sum_l \sum_{m=-l}^{l} {}_nA_l^m\; {}_nW_l(r)\, \mathbf{T}_l^m(\theta, \phi)\, e^{i\, {}_n\omega_l^m t}. \qquad (11)$$

The radial eigenfunction ${}_nW_l(r)$ varies with depth, even though
the resulting displacement has no radial component because $u_r$
is always zero. Thus torsional modes have only horizontal
displacements and are analogous to SH waves. Similarly, their
divergence is zero, so they cause no volume change.

Torsional modes are denoted ${}_nT_l^m$, where $n$ is the radial
order, $l$ is the angular order, and $m$ is the azimuthal order.
For given radial and angular orders, the $2l + 1$ modes of different
azimuthal orders $-l \le m \le l$ are called singlets, and the
group of singlets is called a multiplet. If the earth were perfectly
spherically symmetric, and not rotating, then all the singlets in
a multiplet would have the same eigenfrequency. This condition
is called degeneracy. For example, the period of ${}_nT_l^0$ would
be the same as for ${}_nT_l^{\pm 1}$, ${}_nT_l^{\pm 2}$, ${}_nT_l^{\pm 3}$, etc. In the real earth, the
singlet frequencies vary, which is an effect called splitting.
However, the splitting is small enough that for most applications
we ignore it, dropping the $m$ superscript and referring
to the entire ${}_nT_l^m$ multiplet as ${}_nT_l$, with eigenfrequency ${}_n\omega_l$.

For torsional modes, the horizontal displacements $u_\theta$ and $u_\phi$
are zero along nodal lines, because the angular displacements
$u_\theta$ vanish where $\partial Y_l^m / \partial\phi = 0$ and the azimuthal displacements $u_\phi$
vanish where $\partial Y_l^m / \partial\theta = 0$. For example, consider the lowest-frequency
(longest-period or gravest) torsional normal mode
singlet, ${}_0T_2^0$ (Fig. 2.9-5). There are no radial motions, and the
angular displacements are always zero, because $m = 0$. To see
this, note, from Eqn 10, that $u_\theta$ is proportional to

$$\frac{1}{\sin\theta} P_2^0(\cos\theta) \frac{\partial}{\partial\phi} \left( e^{im\phi} \right) = \frac{1}{\sin\theta} P_2^0(\cos\theta)\, (im)\, e^{im\phi} = 0. \qquad (12)$$

The only nonzero displacement component is the azimuthal
one, $u_\phi$, which is proportional to

$$-e^{im\phi} \frac{\partial}{\partial\theta} P_2^0(\cos\theta) = 3 \sin\theta \cos\theta. \qquad (13)$$

The azimuthal motions vanish at the poles ($\theta = 0°$ and $180°$)
and at the equator ($\theta = 90°$). The motions are in opposite directions
across the equator because $\cos\theta$ changes sign there. This
node is the surface expression of a nodal plane that bisects the
earth along the equator. The pattern of oscillations extends
throughout the mantle. 4
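The azimuthal displacement pattern of this singlet can be worked out by hand. The steps below use $P_2$ from Eqn 6 (for $m = 0$ the associated Legendre function reduces to the Legendre polynomial and $e^{im\phi} = 1$) and the sign of the $\phi$ component in Eqn 10; the normalization constant of Eqn 8 is omitted because only proportionality is needed.

```latex
\begin{aligned}
P_2^0(\cos\theta) &= \tfrac{1}{2}\left(3\cos^2\theta - 1\right), \\
\frac{\partial}{\partial\theta} P_2^0(\cos\theta) &= -3\sin\theta\cos\theta, \\
u_\phi \;\propto\; -\frac{\partial}{\partial\theta} P_2^0(\cos\theta)
  &= 3\sin\theta\cos\theta = \tfrac{3}{2}\sin 2\theta .
\end{aligned}
```

Written as $\tfrac{3}{2}\sin 2\theta$, the pattern has zeros at $\theta = 0°$, $90°$, and $180°$ and opposite signs in the two hemispheres.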

The radial order describes how the mode varies with radius,
and the angular and azimuthal orders describe how it varies
with latitude and longitude. For torsional modes, $n$ gives the
number of spherical nodal surfaces within the earth. If $n = 0$,
there are no nodal surfaces, and the direction of motion at a
given latitude and longitude is the same at all depths. For torsional
modes, $l$ equals one more than the number of nodal lines
on the surface. The shape and distribution of these nodal lines
varies according to the azimuthal order, $m$, which gives the
number of vertical nodal planes that bisect the earth, passing
through the pole. For $m = 0$, the nodal lines are small circles
about the pole. If $m = l - 1$, the nodal lines are great circles
through the pole.

The ${}_0T_2^1$ singlet has a longitudinal great circle node at the
surface (Fig. 2.9-6). The motions are shear displacements 


4 Because the outer core is liquid, the core-mantle boundary is a free surface for tor¬ 
sional modes excited by earthquakes. These modes do not propagate into the outer 
core, and therefore never reach the inner core, which theoretically has its own set of 
torsional modes.

Fig. 2.9-6 Examples of the displacements for several torsional modes. The
examples for the $n = 1$ overtones schematically show the variation with depth.


about the pole that oscillate toward and away from the nodal
plane. The period of ${}_0T_2$ is 44 minutes: 22 minutes rotating
in one direction, then 22 minutes rotating back again. For
higher angular orders $l$, more nodal planes occur. ${}_0T_3^0$ has two
latitudinal nodal lines at the surface, ${}_0T_3^1$ has one, and ${}_0T_3^2$ has
none. As $l$ increases, the number of divisions of the surface
increases.

Torsional modes with $n = 0$ (${}_0T_l^m$) are called fundamental
modes, and have motions at depth in the same direction as at
the surface. This is not true, however, for modes with $n > 0$,
called overtones. As shown in the cutaway for an $n = 1$ overtone, there
is a spherical nodal surface within the mantle across which
displacements reverse. We will see shortly that an overtone
of order $n$ has $n$ radially symmetric nodal surfaces at depths
determined by the velocity structure of the mantle.

You may have wondered what happened to ${}_0T_1$ and ${}_0T_0$.
Because the number of nodal planes equals $l - 1$, ${}_0T_1$ has no
nodal planes. Physically, this corresponds to rigid body rotation.
As we will discuss in Section 4.4.4, seismic waves generated
by earthquakes are generally well described by treating the
source as a double couple of body forces, which generates no
net torque, and therefore no change in rotation. In rare cases,
giant earthquakes may cause enough vertical displacement of
rock to affect the rate of the earth’s rotation. However, because
torsional modes do not involve radial motions, even in these
cases conservation of angular momentum demands that ${}_0T_1$
be zero. There are, however, overtones with $l = 1$ (${}_1T_1$, ${}_2T_1$,
etc.). These involve the entire top spherical shell of the earth
oscillating in one direction, with deeper shells oscillating in
opposing directions. The mode ${}_0T_0$ has no physical meaning
and is undefined.

2.9.5 Spheroidal modes 

P-SV motions are described in a similar way by spheroidal
modes, also known as poloidal modes. These are more complicated
than torsional modes, because they combine radial and
transverse motions. The surface eigenfunctions are given by two
other vector spherical harmonics, with $(r, \theta, \phi)$ components

$$\mathbf{R}_l^m = \left( Y_l^m, 0, 0 \right), \qquad \mathbf{S}_l^m = \left( 0,\; \frac{\partial Y_l^m(\theta, \phi)}{\partial\theta},\; \frac{1}{\sin\theta} \frac{\partial Y_l^m(\theta, \phi)}{\partial\phi} \right). \qquad (14)$$


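Each corresponds to a different radial eigenfunction, ${}_nU_l(r)$
and ${}_nV_l(r)$, so the displacement vector $\mathbf{u} = (u_r, u_\theta, u_\phi)$ for
spheroidal modes is

$$\mathbf{u}^S(r, \theta, \phi, t) = \sum_n \sum_l \sum_{m=-l}^{l} {}_nA_l^m \left[ {}_nU_l(r)\, \mathbf{R}_l^m(\theta, \phi) + {}_nV_l(r)\, \mathbf{S}_l^m(\theta, \phi) \right] e^{i\, {}_n\omega_l^m t}. \qquad (15)$$

Thus the radial eigenfunction ${}_nU_l(r)$ corresponds to radial
motion, and ${}_nV_l(r)$ corresponds to horizontal motion.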

To see that the mode formulation separates P-SV from SH
and fully represents the displacement in three dimensions, note
that the three vector spherical harmonics are orthogonal,

$$\mathbf{T}_l^m \cdot \mathbf{S}_l^m = \mathbf{T}_l^m \cdot \mathbf{R}_l^m = \mathbf{S}_l^m \cdot \mathbf{R}_l^m = 0. \qquad (16)$$


Spheroidal modes ${}_nS_l^m$ follow a nomenclature similar to that of torsional
modes. The fundamental modes, with no internal nodal
surfaces, are described by $n = 0$. As $n$ increases, the number of
internal nodal surfaces increases, although, unlike for torsional
modes, $n$ is not the number of nodal surfaces. The angular
order $l$ equals the number of nodal lines at the surface (rather
than $l - 1$ for torsional modes), and $m$ represents the number
of great circle nodal lines passing through the pole. The spheroidal
radial modes, which have $l = 0$ and thus only radial
motions, have no torsional analogue.

Some examples of spheroidal modes are shown in Fig. 2.9-7.
The “breathing” mode ${}_0S_0$ involves radial motions of the entire
earth that alternate between expansion and contraction. The
gravest (lowest-frequency or longest-period) of earth’s modes
observed to date is ${}_0S_2$, which has a period of 3233 s, or
54 minutes. 5 The ${}_0S_2^0$ singlet alternates between an oblate (flat
disk) and prolate (football) shape, and is accordingly referred
to as the “football” mode. Displacements for two other ${}_0S_2$
singlets are also shown. There is no ${}_0S_1$ mode, which would
correspond to a lateral translation of the planet. Increasing $l$
results in more surface nodal lines, and
increasing $n$ results in more internal nodal surfaces.

5 The ${}_1S_1$ Slichter mode due to lateral sloshing of the solid inner core through the
liquid iron outer core, which has yet to be observed, should in theory have a period of
about 5.5 hours.

Fig. 2.9-7 Examples of the displacements for several spheroidal modes.

2.9.6 Modes and propagating waves

We can gain considerable insight into normal modes by considering
their relation to traveling waves. To do this, we use
a mathematical approximation (that we will not derive) for the
associated Legendre functions. When the angular order
$l$ is much greater than the azimuthal order $m$,

$$P_l^m(\cos\theta) \approx (-1)^m\, l^m \left( \frac{2}{l \pi \sin\theta} \right)^{1/2} \cos\left[ (l + 1/2)\theta + m\pi/2 - \pi/4 \right], \qquad (17)$$

so the spherical harmonics behave approximately like

$$Y_l^m(\theta, \phi) \approx A \left( \frac{2}{l \pi \sin\theta} \right)^{1/2} \cos\left[ (l + 1/2)\theta \right] e^{im\phi}, \qquad (18)$$

where $A$ contains the remaining factors. Using this approximation
and representing the cosine as complex exponentials
shows that terms in the mode sums (Eqns 11 and 15), which
involve the products $Y_l^m(\theta, \phi)\, e^{i\, {}_n\omega_l t}$, give rise to terms corresponding
to propagating waves with horizontal wave vector
(Section 2.4.2)

$$\mathbf{k}_x = (k_\theta, k_\phi), \qquad k_\theta = (1/a) \left[ (l + 1/2)^2 - m^2 / \sin^2\theta \right]^{1/2}, \qquad k_\phi = m / (a \sin\theta), \qquad (19)$$

where the factor of the earth’s radius $a$ converts the angular
terms to wavenumbers along the surface. Hence the mode with
angular order $l$ and frequency ${}_n\omega_l$ corresponds to a traveling
wave with horizontal wavelength

$$\lambda_x = 2\pi / |\mathbf{k}_x| = 2\pi a / (l + 1/2) \qquad (20)$$

that has $l + 1/2$ wavelengths around the earth (Fig. 2.9-8).
These waves travel at a horizontal phase velocity

$$c_x = {}_n\omega_l / |\mathbf{k}_x| = {}_n\omega_l\, a / (l + 1/2). \qquad (21)$$

This equivalence is easily visualized a while after an earthquake,
where globe-circling surface waves can be viewed as
standing waves, or modes. Waves corresponding to different
singlets propagate in different directions, as shown by the
various values of $m$.

Fig. 2.9-8 Cartoon of the equivalence of surface waves and normal
modes, shown a few minutes after the earthquake and a few hours after
the earthquake. Once surface waves from an earthquake make multiple passes
around the earth, they can be viewed as standing waves, or normal modes,
such that the mode with angular order $l$ has $l + 1/2$ wavelengths around
the earth. This example is for a fundamental mode ($n = 0$) with $l = 25$.
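The step from the wave vector components of Eqn 19 to the wavelength of Eqn 20 is short enough to write out. Squaring and adding the two components removes the dependence on $m$ and $\theta$:

```latex
\begin{aligned}
|\mathbf{k}_x|^2 &= k_\theta^2 + k_\phi^2
  = \frac{1}{a^2}\left[(l + 1/2)^2 - \frac{m^2}{\sin^2\theta}\right]
    + \frac{m^2}{a^2\sin^2\theta}
  = \frac{(l + 1/2)^2}{a^2}, \\
\lambda_x &= \frac{2\pi}{|\mathbf{k}_x|} = \frac{2\pi a}{l + 1/2},
\qquad
\frac{2\pi a}{\lambda_x} = l + \tfrac{1}{2}.
\end{aligned}
```

The last expression is the number of wavelengths that fit around the circumference $2\pi a$, which is why every singlet of a multiplet shares the same wavelength even though the singlets propagate in different directions.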

This approximation also gives insight into the correspondence
between spheroidal and torsional modes and P-SV and
SH waves (or Rayleigh and Love waves). The spheroidal and
torsional mode displacements depend on vector spherical harmonics,
and thus on the derivatives of spherical harmonics.
Taking derivatives of Eqn 18 shows the ratio of the partial
derivatives,

$$\frac{\partial Y_l^m(\theta, \phi)}{\partial\theta} \Big/ \frac{\partial Y_l^m(\theta, \phi)}{\partial\phi} \gg 1, \qquad (22)$$

because $l$ was assumed to be much greater than $m$. For
torsional modes, the $\mathbf{T}_l^m$ vector spherical harmonic (Eqn 10)
generally has a $\phi$ component greater than its $\theta$ component, so
its displacement is primarily perpendicular to the plane connecting
the source and the receiver, like an SH or a Love wave
(Fig. 2.9-1). By contrast, the spheroidal mode vector spherical
harmonic $\mathbf{S}_l^m$ (Eqn 14) generally has a $\theta$ component greater
than its $\phi$ component, and so causes displacement primarily in
the plane connecting the source and the receiver, like a P-SV or
a Rayleigh wave.

We can use these ideas to relate modes to specific body and 
surface wave phases. A good place to start is to recall that for 
Love waves in a layer over a halfspace, the boundary condi¬ 
tions at the free surface and the interface require that the Love 
wave have discrete eigenfrequencies that depend on the layer 
thickness and the shear velocity of the layer and the halfspace. 
We thus obtain a dispersion relation (Section 2.7.3) giving the 
phase velocity as a function of frequency for these modes. 
Because the dispersion relation depends on the earth structure 
assumed in computing it, we can compare the observed dis¬ 
persion of surface waves to the predictions of different earth 
models, and invert the observations to derive earth models that 
better fit the data (e.g., Fig. 2.8-3). 

Analogous computations for the spherical earth predict 
the normal mode eigenfunctions and eigenfrequencies, which 
depend on the earth model assumed. Figure 2.9-9 shows a plot 
of radial eigenfunctions for some modes. As for surface waves, 
modes with different eigenfrequencies sample different depths 
within the earth. For example, as noted in Fig. 2.9-6,
Fig. 2.9-9a shows that a torsional overtone of order $n$ has $n$ nodal
surfaces at depths determined by the velocity structure of the 
mantle. Thus the observed eigenfrequencies can be inverted to 
model the earth’s radial velocity structure. This process yields 
earth models that match the observed eigenfrequencies quite 
well, as illustrated by the dashed line in Fig. 2.9-2. Moreover, 
the results can be checked by combining them with travel time 
observations. For instance, before PKJKP body waves 6 were 
observed, the shear velocity of the inner core was constrained 
using normal modes like ${}_{10}S_2$ that have large displacements in
the inner core. 

Figure 2.9-10 shows a plot of the eigenfrequency versus 
angular order for torsional modes. The modes plot along dis¬ 
tinct lines, corresponding to overtone branches. The lowest line 
is the fundamental branch (radial order $n = 0$) with the lowest
eigenfrequency (longest period) for any given angular order. The 

6 As discussed in Section 3.5, PKJKP is an elusive body wave phase that propagates 
in the inner core as a shear wave, and so provides information on the difficult-to- 
constrain shear velocity there.

Fig. 2.9-9 Radial (vertical) eigenfunctions for
various modes as functions of depth from the surface
to the core-mantle boundary. a: Torsional modes
with a low angular order of $l = 2$ for the fundamental
mode ($n = 0$) and higher overtones. The modes
sample fairly evenly across the whole mantle, with
the radial order giving the number of times the
displacements change sign. b: Torsional modes
with about the same frequency (14 mHz). When
$l < {\sim}4n$, the modes correspond to $ScS_{SH}$ waves, and
the eigenfunctions span the whole mantle. When
$l > {\sim}4n$, the modes correspond to SH waves that
bottom in the mid-mantle, and the eigenfunctions
tail off before reaching the core-mantle boundary.
c: Second-overtone branch of Love wave-equivalent
torsional modes. Because the radial order is always
$n = 2$, the curves always have two zero crossings, so the
displacement directions are always divided into three
regions. The eigenfunctions get shallower at higher
angular orders. d: Second-overtone branch of
Rayleigh wave-equivalent spheroidal modes.
As with b, the eigenfunctions get shallower at
higher angular orders. The solid lines show the
eigenfunction for radial displacements, U, and
the dashed lines show the eigenfunction for
tangential displacements, V. (Dahlen and Tromp,
1998. Copyright © by Princeton University Press.
Reprinted by permission of Princeton University
Press.)


lines of successively higher eigenfrequencies (shorter periods) 
define overtone branches with increasing n. On any branch, the 
eigenfrequency increases for higher angular order $l$.

As we have seen, the angular order $l$ relates modes to
traveling waves of a specific wavelength (Eqn 20) or phase 
velocity (Eqn 21). Thus frequency-angular order plots for 
normal modes as in Fig. 2.9-10 correspond to dispersion (phase 
velocity-period) plots for surface waves (Fig. 2.7-8) and are 
sometimes called normal mode dispersion plots. 
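The conversion from a point on a frequency-angular order plot to a point on a dispersion curve follows directly from Eqns 20 and 21. The sketch below is illustrative only: the earth radius of 6371 km is an assumed mean value (the text simply calls it $a$), the function names are hypothetical, and the two (angular order, period) pairs are taken to be the fundamental Rayleigh entries of Table 2.9-1 with $l = 30$ and $l = 130$.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the text only calls it a

def horizontal_wavelength_km(l, radius_km=EARTH_RADIUS_KM):
    """Eqn 20: wavelength of the traveling wave equivalent to angular order l."""
    return 2.0 * math.pi * radius_km / (l + 0.5)

def phase_velocity_km_s(l, period_s, radius_km=EARTH_RADIUS_KM):
    """Eqn 21: c_x = omega * a / (l + 1/2), with omega = 2*pi / period."""
    return (2.0 * math.pi / period_s) * radius_km / (l + 0.5)

# (angular order, period in s): illustrative fundamental-branch values (see lead-in)
examples = {"0S30": (30, 262.1), "0S130": (130, 75.8)}
for name, (l, period) in examples.items():
    print(f"{name}: wavelength {horizontal_wavelength_km(l):7.1f} km, "
          f"phase velocity {phase_velocity_km_s(l, period):.2f} km/s")
```

The shorter-period mode comes out with the lower phase velocity, as expected for a wave whose motion is confined closer to the surface.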

Various regions of the torsional mode dispersion plot in 
Fig. 2.9-10 correspond to different body and surface shear wave 
(SH) phases, which are discussed further in the next chapter. 
The horizontal phase velocity (Eqn 21) of the waves corresponding
to a given mode can be related to the horizontal
phase velocity of a surface wave or the apparent velocity of
a body wave phase. The upper left of the figure, with high
frequency and low angular order $l$, contains modes that contribute
to body wave phases with high apparent velocities
and thus near-vertical incidence (recall from Section 2.5.3 that
$c_x = v / \sin i$), such as the core reflections (Figs 1.1-2 and 3.5-5)
ScS, sScS, and $ScS_2$. The dashed line corresponds to modes with
a phase velocity around 7.3 km/s, which is the apparent velocity
of shear waves that diffract around the core. We will see
that these $SH_{diff}$ waves bottom and turn at the core-mantle
boundary, and so represent the transition between direct S and
ScS, which reflects at the core-mantle boundary. To the right
of the dashed line are modes corresponding to S wave phases
that bottom in the mantle, like S, SS, sS, sSS, and SSS. Modes
further to the right (higher $l$) for a given frequency have lower
phase velocity, and thus correspond to body wave phases (Section 3.4)
that bottom at shallower depths in the mantle. The
difference is shown by the radial eigenfunctions (Fig. 2.9-9b)
for torsional modes of about the same frequency. Modes to
the left of ${}_9T_{43}$ have significant displacement throughout the
mantle, corresponding to phases that reach the core-mantle
boundary, whereas those to the right increasingly correspond
to phases that penetrate only to shallower depths.

We can also consider modes that are equivalent to surface
waves, bearing in mind a slight notational complexity that the
higher ($n > 0$) overtone branches are sometimes termed “higher
modes” when discussing surface waves (Section 2.7.4). The
torsional modes furthest to the right in Fig. 2.9-10, which are
the lowest-overtone branches, can be viewed as Love waves.
The $n = 0$ branch with $l$ greater than about 20 corresponds to
fundamental mode Love waves, that for $n = 1$ corresponds
to the first Love wave overtone, and so on. The radial eigenfunctions
in Fig. 2.9-9c for the ${}_2T$ ($n = 2$) branch show that
modes with successively higher $l$ have displacements increasingly
concentrated near the surface. This is consistent with our
observation that higher-frequency (shorter-period) Love waves
for a given overtone branch $n$ have displacements closer to the
surface (Fig. 2.7-10).

Fig. 2.9-10 Frequency-angular order
(dispersion) plot for torsional modes,
computed using the PREM model
(Dziewonski and Anderson, 1981). All
torsional modes (28,588) with periods of
12 s or greater are shown. They span 79
radial orders (branches) and 941 angular
orders (on the fundamental branch, where
$n = 0$). The boxed region at the lower left
is enlarged as an inset. Lines through the
origin have constant phase velocity, like that
shown for core-diffracted S waves, $S_{diff}$. Labels
indicate the groups of modes that correspond
to particular body and surface wave phases.

Fig. 2.9-11 Frequency-angular order
(dispersion) plot for spheroidal modes,
computed using the PREM model. All
spheroidal modes with periods of 50 s or
greater are shown. Note the complexity
of the branches compared to the toroidal
modes. The dashed lines show the phase
velocities of modes corresponding to the core
diffractions $P_{diff}$ and $S_{diff}^{SV}$ (also called $SV_{diff}$).
To the left of the $P_{diff}$ line are modes
corresponding to core reflected and
transmitted phases like PcP, PKiKP, and
the various branches of PKP. To the right
of this line are modes corresponding to P
waves that bottom in the mid-mantle. To the
left of the $S_{diff}^{SV}$ line are modes corresponding
to core reflected and transmitted phases like
ScS and SKS (mixed in with the PcP and PKP
modes to the left of the $P_{diff}$ line).
To the right of the $S_{diff}^{SV}$ line are modes
corresponding to SV waves that bottom in the
mid-mantle. The first few mode branches at
the right correspond to Rayleigh waves. (After
Dahlen and Tromp, 1998. Copyright © by
Princeton University Press. Reprinted by
permission of Princeton University Press.)

The situation for spheroidal modes is more complicated
(Fig. 2.9-11). The fundamental branch remains distinct, but
the overtone branches cross, because these modes include both
P and SV energy. Some spheroidal modes involve primarily 
radial motion, and some involve primarily tangential motion, 
with a full spectrum in between. 

However, the basic patterns seen for torsional modes also 
apply for spheroidal modes. For a given frequency, modes 
to the left (low l) correspond to core phases, and those in the 
center correspond to mantle body wave phases, with the core 
diffraction (dashed lines) being the boundary between these 
groups of modes. The modes furthest to the right correspond to 
Rayleigh surface waves. The modes corresponding to P-wave 
phases are further to the left than their SV counterparts because 
P waves travel faster than S waves. These ideas can be visual¬ 
ized by considering the radial eigenfunctions (Fig. 2.9-9d). 
The two curves for each spheroidal mode represent the two 
radial eigenfunctions U(r) (radial) and V(r) (tangential). Modes 
with low $l$ and higher $n$ have larger deep displacements, so
superposition of modes with very low $l$ yields the core phases.
For the low-order overtones ($n = 2$ is shown), the displacements
are closer to the surface as $l$ increases. Thus the $n = 0$ branch
with $l$ greater than about 20 corresponds to fundamental mode
Rayleigh waves, and the higher branches ($n = 1, 2$, etc.) correspond
to successively higher Rayleigh wave overtones.

The equivalence between normal modes and propagating 
waves gives us a powerful tool. For example, in Section 3.5.5 
we will see models of wave propagation in the mantle that were 
computed using modes, and so include core reflections, diffrac¬ 
tions, and many other phases. Similarly, in Section 4.3.4 the 
radiation patterns showing how various faults radiate surface 
wave energy in different directions are computed using modes. 
Thus we use either mode or wave methods, depending on 
which seems easiest for a particular calculation. It is often 
useful to do both and compare the answers, using each method 
to bring different insight. 

2.9.7 Observing normal modes 

As with many basic seismological concepts, the idea of the 
planet’s modes developed long before instruments became 
available to observe them. As the theory of elasticity was 
developed in the mid-1800s, there were discussions of finding 
the “pitch” of the earth. 7 In 1882, Lamb modeled the earth as a 
homogeneous steel ball, and calculated a fundamental mode of 
78 minutes. In 1911, Love took into account the effect of grav¬ 
ity on radial motions of the earth, and revised the predicted 
fundamental period to 60 minutes, not far from the actual 
54 minutes. However, because making seismometers that can 
detect such long-period motions is difficult (Section 6.6), it was 
only after the great 1952 Kamchatka earthquake that this 
mode was actually observed on a strainmeter recording. 

7 Earth’s gravest observed mode, ${}_0S_2$, corresponds to a note of E, twenty octaves below
middle E on a piano. Johannes Kepler, among others, wrote about the “music of
the spheres,” and thought that each planet’s revolution around the sun corresponded 
to a musical note. The earth’s 365.25 day revolution would correspond to a note of 
C#, 33 octaves below middle C#. 
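The octave counts in this footnote are easy to check, since each octave halves the frequency. The sketch below assumes equal temperament with A above middle C at 440 Hz, which puts the E and C# above middle C at about 329.63 Hz and 277.18 Hz; those tuning values and the function name are assumptions for illustration, not part of the text.

```python
import math

E4_HZ = 329.63        # E above middle C, assuming equal temperament with A4 = 440 Hz
C_SHARP4_HZ = 277.18  # C# above middle C, same tuning assumption

def octaves_below(reference_hz, period_s):
    """Number of octaves by which an oscillation of given period lies below a note."""
    return math.log2(reference_hz * period_s)

mode_octaves = octaves_below(E4_HZ, 3233.5)                 # period of 0S2 in seconds
year_octaves = octaves_below(C_SHARP4_HZ, 365.25 * 86400.0)  # one revolution
print(f"0S2: {mode_octaves:.2f} octaves below E")
print(f"earth's revolution: {year_octaves:.2f} octaves below C#")
```

Both results should land within a small fraction of an octave of the whole numbers quoted.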



Table 2.9-1 Some torsional and spheroidal modes.

Mode                                      Period (s)   Description or associated phase

${}_0T_2$                                    2,639.4   fundamental torsional
${}_0T_3$                                    1,707.6   fundamental torsional
${}_1T_1$                                      808.4   radial overtone
${}_1T_2$                                      757.5   radial overtone
[illegible]                                    104.4   radial overtone
${}_0T_{30}$                                   259.5   fundamental Love
${}_0T_{130}$                                   68.9   fundamental Love
${}_2T_{30}$                                   151.3   second-overtone Love
${}_4T_{67}$                                    71.3   SH
${}_{10}T_{40}$                                 71.4   $SH_{diff}$
${}_{13}T$ [angular order illegible]            71.6   $ScS_{SH}$

${}_0S_0$                                    1,228.1   fundamental radial
${}_1S_0$                                      613.0   radial overtone
${}_0S_2$                                    3,233.5   football
${}_0S_3$                                    2,134.4   pear-shaped
${}_0S_{30}$                                   262.1   fundamental Rayleigh
${}_0S_{130}$                                   75.8   fundamental Rayleigh
${}_2S_{30}$                                   160.9   second-overtone Rayleigh
${}_{10}S_6$                                   203.5   inner core PKJKP
${}_{11}S_5$                                   197.1   inner core PKIKP
${}_{14}S_3$                                   184.9   mantle $ScS_{SV}$
${}_1S_1$                                     19,500   Slichter

Sources: Dziewonski and Anderson (1981); Wysession and Shore (1994);
Dahlen and Tromp (1998).


Advances in seismic instrumentation, together with the 
occurrence of the great 1960 Chilean and 1964 Alaska earth¬ 
quakes, made it possible to identify and study large numbers 
of modes. Over 40 modes were identified from the 1960 
Chilean earthquake. The number of modes that have been 
observed is now several thousands, due to continued advances 
in seismometry, which permit recording at very long periods 
(Section 6.6), an increase in the number of stations, more 
powerful analytical techniques, and many large earthquakes. 
Although none of the earthquakes has come close in size to 
the 1960 Chilean event, the advances in instrumentation and 
processing largely compensate. Large earthquakes are needed 
to excite the gravest modes, and long lengths of seismograms 
are needed to resolve their properties. As discussed in Sec¬ 
tion 6.3.3, this requires a seismogram that has significant en¬ 
ergy extending over a time much longer than a mode’s period. 
Fortunately this is the case for the largest earthquakes, leading 
to the analogy that the earth rings like a bell after they occur. 8 
Seismograms extending for many days are analyzed after the 
largest earthquakes. 
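The need for long records can be made concrete with a rule of thumb. The sketch below assumes the usual Fourier resolution estimate, that two spectral peaks separated by $\Delta f$ need a record of length at least about $1/\Delta f$ to be distinguished; this criterion and the function names are assumptions introduced here, and the periods are the ${}_0S_2$ and ${}_0T_2$ values from Table 2.9-1.

```python
def min_record_length_s(period_a_s, period_b_s):
    """Record length T for which the frequency spacing 1/T equals |f_a - f_b|."""
    return 1.0 / abs(1.0 / period_a_s - 1.0 / period_b_s)

def cycles_recorded(period_s, record_length_s):
    """Number of oscillation cycles of a mode contained in a record."""
    return record_length_s / period_s

separation_h = min_record_length_s(3233.5, 2639.4) / 3600.0
print(f"record length to separate the two peaks: about {separation_h:.1f} hours")
print(f"cycles of the 3233.5 s mode in a 35-hour record: "
      f"{cycles_recorded(3233.5, 35 * 3600.0):.0f}")
```

This is only a lower bound for separating two neighboring multiplets. Resolving the much finer splitting of singlets within a multiplet requires correspondingly longer records, consistent with the multi-day seismograms described above.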

8 Actually, because the earth vibrates at many frequencies, rather than just one, and
is laterally heterogeneous, a better but less poetic analogy would be that the earth
rattles like a dented garbage can.

Table 2.9-1 shows the periods of several modes, some of
which have been discussed earlier. Note that for the fundamental
(${}_0S$ and ${}_0T$) branches, modes with angular orders greater
than about 20 correspond to the fundamental mode Rayleigh and
Love waves with those periods and are often viewed as traveling
waves. However, the longest-period modes, like ${}_0S_2$, ${}_0S_3$,
${}_0T_2$, and ${}_0T_3$, have such long periods that we think of
them as modes. Higher-order modes are often thought of in terms
of a body wave phase to which they contribute. Of course, the
descriptions are equivalent.
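
The equivalence between a mode and a traveling wave can be made concrete
with a short calculation. The sketch below assumes the asymptotic (Jeans)
relation ω = (l + 1/2)c/a between angular order l, angular frequency ω,
phase velocity c, and earth radius a = 6371 km. That form is an assumption
made here for illustration rather than a quotation of the text's own
equation, and the function names are hypothetical. The periods are the
Table 2.9-1 values for two fundamental Rayleigh modes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius a, assumed for this sketch


def equivalent_wavelength(l, radius_km=EARTH_RADIUS_KM):
    """Horizontal wavelength (km) of the traveling wave for angular order l."""
    return 2.0 * math.pi * radius_km / (l + 0.5)


def equivalent_phase_velocity(l, period_s, radius_km=EARTH_RADIUS_KM):
    """Phase velocity (km/s) from c = omega * a / (l + 1/2), omega = 2*pi/T."""
    return equivalent_wavelength(l, radius_km) / period_s


if __name__ == "__main__":
    # (label, angular order, period in seconds)
    for name, l, period_s in [("0S30", 30, 262.1), ("0S130", 130, 75.8)]:
        c = equivalent_phase_velocity(l, period_s)
        lam = equivalent_wavelength(l)
        print(f"{name}: wavelength {lam:7.1f} km, phase velocity {c:.2f} km/s")
```

Under these assumptions the two modes correspond to waves with phase
velocities of about 5.0 and 4.0 km/s, so the longer-period mode, which
samples deeper and faster material, travels faster.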


2.9.8 Normal mode synthetic seismograms 

As we will see in many places in this text, various techniques
are used to create theoretical, often called synthetic, seismograms
for the earth. One of these is normal mode summation,
analogous to the way the propagating waves on the string were
generated in Section 2.2. This summation is also the way that
a music synthesizer creates a particular sound by summing the
right combination of harmonic overtones (i.e., modes).⁹

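For example, torsional mode displacements (Eqn 11) are
synthesized by

$$
\mathbf{u}^T(r_r, \theta_r, \phi_r)
= \sum_n \sum_l \sum_{m=-l}^{l} {}_nA_l^m(r_s, r_r)\, {}_nW_l(r_r)\,
\mathbf{T}_l^m(\theta_r, \phi_r)\,
e^{i\,{}_n\omega_l^m t}\, e^{-{}_n\omega_l^m t / 2\,{}_nQ_l}
\qquad (23)
$$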


To do this, we need to know the modes’ radial eigenfunctions
${}_nW_l$ and eigenfrequencies ${}_n\omega_l^m$, which are determined by the
earth’s velocity and density structure. These modes are then
weighted by excitation amplitudes ${}_nA_l^m$, determined by the
depth, geometry, and time history of the seismic source and
the depth of the receiver. We also need to know the attenuation,
or quality, factor ${}_nQ_l$, discussed in Sections 3.7 and 7.4,
which measures the rate at which the mode’s seismic energy is
lost by friction (without this effect, the earth would ring like a
bell forever). This formulation assumes that all singlets in a
multiplet have the same quality factor.
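
A one-dimensional analog may help show how the ingredients of the mode
sum combine. The sketch below sums the modes of a uniform string with
fixed ends, in the spirit of Section 2.2: each mode has an eigenfunction
evaluated at the receiver, an excitation weight set by the source
position, an oscillatory time term, and a decay term controlled by Q.
The function name, the unit string length and velocity, and the use of a
single Q for all modes are assumptions made for illustration; this is
not the formulation used for the earth.

```python
import numpy as np


def string_mode_sum(x_receiver, x_source, times, length=1.0, speed=1.0,
                    n_modes=200, q=None):
    """Displacement at x_receiver from summed modes of a fixed-end string."""
    times = np.atleast_1d(np.asarray(times, dtype=float))
    n = np.arange(1, n_modes + 1)[:, None]              # mode numbers (column)
    omega = n * np.pi * speed / length                  # eigenfrequencies
    excitation = np.sin(n * np.pi * x_source / length)  # weight from source position
    shape = np.sin(n * np.pi * x_receiver / length)     # eigenfunction at receiver
    time_term = np.cos(omega * times[None, :])          # oscillation of each mode
    if q is not None:
        # decay term exp(-omega t / 2Q); the same Q is assumed for every mode
        time_term = time_term * np.exp(-omega * times[None, :] / (2.0 * q))
    # sum over modes (axis 0) leaves one displacement value per time sample
    return np.sum(excitation * shape * time_term, axis=0)


if __name__ == "__main__":
    t = np.linspace(0.0, 4.0, 401)
    undamped = string_mode_sum(0.7, 0.2, t)
    damped = string_mode_sum(0.7, 0.2, t, q=50.0)
    print(undamped[:3], damped[:3])
```

Moving `x_source` changes the weights, so a mode with a node at the
source position is not excited at all, which is the string counterpart
of the source dependence of the excitation amplitudes.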

The modes of the earth are found by computing the 
radial eigenfunctions and the corresponding eigenfrequencies. 
Although this process is beyond our scope here, several 
techniques have been developed to do this. Some involve 
propagating the values of stresses and displacements from the 
center of the earth to the surface, layer by layer, while satisfying 
the boundary conditions at each layer. The frequency of the 
mode is iterated until the final surface values satisfy the free 
surface boundary conditions. This process is analogous to that 
used to determine the periods of the Love waves in the layer 
over the halfspace example (Section 2.7.3). 
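
The iteration on frequency can be illustrated with a much simpler system
than the earth. The sketch below applies the same idea, often called a
shooting method, to a string with fixed ends: displacement and slope are
integrated from one end to the other for a trial frequency, and the
frequency is adjusted by bisection until the far-end boundary condition
is met (zero displacement here, rather than the zero traction of a free
surface). The function names, the Runge-Kutta integrator, and the
velocity profile are assumptions for illustration, not the algorithm of
any particular normal mode code.

```python
import math


def end_displacement(omega, velocity, length=1.0, steps=2000):
    """Integrate u'' = -(omega / v(x))**2 * u from x = 0 and return u(length)."""
    h = length / steps
    x, u, du = 0.0, 0.0, 1.0  # fixed end: zero displacement, arbitrary unit slope

    def accel(xi, ui):
        return -(omega / velocity(xi)) ** 2 * ui

    for _ in range(steps):
        # classical fourth-order Runge-Kutta step for the pair (u, du)
        k1u, k1v = du, accel(x, u)
        k2u, k2v = du + 0.5 * h * k1v, accel(x + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = du + 0.5 * h * k2v, accel(x + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = du + h * k3v, accel(x + h, u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        x += h
    return u


def find_eigenfrequency(lo, hi, velocity, tol=1e-10):
    """Bisect on omega until the far end of the string has zero displacement."""
    f_lo = end_displacement(lo, velocity)
    f_hi = end_displacement(hi, velocity)
    if f_lo * f_hi > 0.0:
        raise ValueError("bracket does not contain an eigenfrequency")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = end_displacement(mid, velocity)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # uniform string with unit velocity and length: the first mode is omega = pi
    print(find_eigenfrequency(3.0, 3.3, lambda x: 1.0))
```

Passing a position-dependent `velocity` function gives the modes of a
non-uniform string, which is closer in spirit to a layered earth model.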

The amplitudes, or excitation coefficients, depend on the
earthquake’s fault geometry. One of the many advantages of
evaluating the normal modes in a coordinate system whose
pole is at the seismic source is that the radiated energy has
strong symmetry. As noted in Section 1.1 and discussed further
in Chapter 4, earthquakes radiate energy in a pattern with
four-lobed symmetry about the fault plane. Thus any given fault
geometry is reflected by various combinations of the m = 0, ±1,
and ±2 singlets. The excitation also depends on the depth of the
source, much as that for the string depended on the source
position.¹⁰ An earthquake at 600 km depth strongly excites modes
whose eigenfunctions are large at that depth, whereas other
modes are barely excited. However, the relative excitations
will be very different for an earthquake at 10 km depth. For
example, as previously discussed, fundamental mode surface
waves correspond to the fundamental (n = 0) branch of torsional
and spheroidal modes, for angular orders greater than
about 20. Because these modes’ radial eigenfunctions are small
at great depths, a 600 km-deep earthquake does not excite
surface waves efficiently.

The modes are summed at a specific receiver location. Thus
the displacements in Eqn 23 are expressed in terms of the
radius of the source $r_s$, the radius of the receiver $r_r$, and the
colatitude and azimuth of the receiver, $\theta_r$ and $\phi_r$. A slightly
disturbing feature of the mode sum (Eqn 23) is that both the
time functions and the vector spherical harmonics are complex
numbers. However, the sum gives the displacement as a real
number. Similarly, although individual modes oscillate everywhere
on earth at all times, even before a traveling wave from
an earthquake could arrive, the mode sum yields waves that
arrive after a finite time. Thus, although modes are mathematical
objects that are hard to visualize, their sum gives rise to a
meaningful physical displacement (Fig. 2.9-12).

Figure 2.9-13 shows a comparison of observed seismograms
with synthetic seismograms created using normal mode
summation. The fits are good enough that many studies use
observed normal mode amplitudes to find the fault geometry
and focal depth of earthquakes, especially when they are large
and remote from seismometers. This process is an inverse
problem, corresponding to the forward problem of generating
a synthetic seismogram.

It is worth noting that although synthetic seismograms are
usually computed for receivers at the surface (where seismometers
are), they can also be computed for any depth within the earth.
Figure 2.9-14 shows a record section that would be recorded at a
distance of 70° from an earthquake if seismometers could be placed
at depths ranging from the surface to the core-mantle boundary.
We will use this idea shortly to visualize shear wave propagation
(Section 3.5.5) by evaluating normal mode synthetic
seismograms at 100,000 locations in the mantle.


9 Although the fundamental notes for a clarinet, trumpet, and flute might be the
same, playing each instrument excites a different suite of overtones, giving a different
sound. For instance, because the closed end of the clarinet allows only odd-numbered
overtones, the absence of even-numbered overtones contributes to its warm, dark
sound. Early synthesizers added only a few overtones, producing a false, tinny
sound. Modern synthesizers sum overtones up to and beyond frequencies of
20,000 Hz, the limit of human hearing, so the synthesized sounds can be
indistinguishable from those of the actual instruments.


2.9.9 Mode attenuation, splitting, and coupling

So far, we have discussed the modes of a spherically symmetric,
nonrotating, purely elastic, and isotropic earth. This
idealized body, sometimes called a SNREI (“sneery”) earth, is
a reasonable approximation, because the earth is approximately
spherically symmetric and elastic, and its rotation period
is long compared to those of the normal modes. In this case, we
expect the normal mode spectrum of an earthquake to show
sharp peaks for each mode. However, when we look at data like
Fig. 2.9-2, we see that some peaks vary in width and that some
mode peaks overlap with others. These features reflect the
complexities of making measurements of the modes of the real earth.

10 This effect is analogous to the way in which bowing at different locations on a
violin string makes different sounds.

Fig. 2.9-12 Synthesis of a body wave
seismogram using torsional normal modes.
The numbered lines are mode sums for
successive overtone branches, and their
sum gives a seismogram including the
core reflection ScS. (Figure by E. Okal.
Reprinted courtesy of E. Okal.)

Fig. 2.9-14 Shear wave synthetic
seismograms computed at a series
of depths, all at a distance of 70° from a
600 km-deep hypothetical earthquake.
(After Wysession and Shore, 1994.
Pure Appl. Geophys., 142, 295-310,
reproduced with the permission of
Birkhäuser.)

The first effect worth noting is that seismograms are not
infinitely long. Thus each mode’s displacement is not a pure
sinusoid of single frequency extending for infinite time, but
instead stops when the seismogram ends. We will see in Section
6.3.3 that taking a finite portion of a sinusoid broadens its
spectrum from a sharp spectral line (a delta function) to a wider
peak. Physically, this is because other frequencies are needed
to make the time function end rather than go on forever. The
shorter the time we use, the worse the broadening is. This problem
seems easy to solve, since we can take a seismogram for as
long as we want, and thus make peaks narrow. However, we
do not want to go on too long, because the longer we wait after
an earthquake, the more the earthquake’s signal will decay
relative to the ground noise, which can include signals from
other earthquakes.

Fig. 2.9-15 Amplitude spectra of the nine singlets of the split spheroidal
mode multiplet ${}_{18}S_4$. The m = -4 singlet is in front, and the m = 4 singlet is
in back. (Widmer et al., 1992.)
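
The broadening caused by a finite record length can be checked
numerically. The sketch below assumes NumPy and uses an illustrative
100 s period and 1 s sampling, values chosen for convenience rather than
taken from the text. It truncates a cosine to a given record length,
computes the zero-padded amplitude spectrum, and measures the full width
of the peak at half its maximum. The function name and parameters are
hypothetical.

```python
import numpy as np


def peak_width_hz(record_length_s, f0_hz=0.01, dt_s=1.0, pad_factor=64):
    """Full width at half maximum of the spectral peak of a truncated cosine."""
    n = int(round(record_length_s / dt_s))
    t = np.arange(n) * dt_s
    signal = np.cos(2.0 * np.pi * f0_hz * t)       # sinusoid cut off at record end
    nfft = pad_factor * n                          # zero padding refines the grid
    spectrum = np.abs(np.fft.rfft(signal, n=nfft))
    freqs = np.fft.rfftfreq(nfft, d=dt_s)
    main_lobe = freqs[spectrum >= 0.5 * spectrum.max()]
    return main_lobe.max() - main_lobe.min()


if __name__ == "__main__":
    for length in (1000.0, 2000.0, 4000.0):
        print(f"record {length:6.0f} s -> peak width {peak_width_hz(length):.2e} Hz")
```

Doubling the record length roughly halves the measured width, which is
the inverse scaling between record length and peak width described
above.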

This consideration brings us to the next issue, that the
modes’ amplitudes decay with time because attenuation converts
the seismic wave energy to heat. As discussed in Section
3.7, attenuation (sometimes termed anelasticity) represents the
deviation of the earth from perfect elasticity. This effect is
modeled by describing the time history of a mode (Eqn 23) as
the product of a periodic oscillation and a decay term

$$
e^{i\,{}_n\omega_l^m t}\, e^{-{}_n\omega_l^m t / 2\,{}_nQ_l}
\qquad (24)
$$

where ${}_nQ_l$ is the mode’s attenuation, or quality, factor, which
we treat as the same for all singlets. Infinite Q corresponds to
no attenuation, so the oscillation would continue forever,
whereas lower Q (higher attenuation) causes the oscillation to
decay rapidly. We will see that this effect broadens the spectrum
from a single line at frequency ${}_n\omega_l^m$ to a wider peak,
because additional frequencies are needed to describe the time
decay. The effects of attenuation on the spectrum are similar to
those of taking a finite length of seismogram. If we correct for
the finite seismogram, we can measure the Q of each mode.
These data can then be used to determine how anelasticity
within the earth varies with depth (Section 7.4).
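
As a rough worked example of the decay term, the amplitude envelope
falls to 1/e of its initial value after a time that follows directly
from the exponent. The numbers are illustrative: the period is that of
the 262.1 s fundamental Rayleigh mode in Table 2.9-1, and Q = 300 is an
assumed round value, not a measurement quoted in the text.

$$
e^{-\omega t / 2Q} = e^{-1}
\;\Rightarrow\;
t_{1/e} = \frac{2Q}{\omega} = \frac{QT}{\pi},
\qquad
T = 262.1\ \mathrm{s},\ Q = 300:\quad
t_{1/e} \approx \frac{300 \times 262.1\ \mathrm{s}}{\pi}
\approx 2.5 \times 10^{4}\ \mathrm{s} \approx 7\ \mathrm{hours}.
$$

Under these assumptions the mode completes about Q/π ≈ 95 cycles before
its amplitude drops by a factor of e.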

Other factors can also affect spectral peaks. For a SNREI
earth, a mode’s frequency depends only on the radial order n
and the angular order l, so the 2l + 1 singlets of different
azimuthal order -l ≤ m ≤ l would have the same eigenfrequency.
However, in the real earth, the singlet frequencies
vary slightly, causing mode splitting. The split singlets broaden
the peak produced by the entire multiplet. Peaks due to individual
singlets can sometimes be resolved on high-quality
long-period seismograms (Fig. 2.9-15). To identify singlets,
the analysis shown used a stacking method (Section 6.5) that
exploited the fact that individual singlets within the multiplet
have different surface eigenfunctions, so spectra at different
stations can be weighted and combined to enhance the desired
singlet and suppress others.

The causes of mode splitting can be visualized by considering
a mode multiplet to be a superposition of singlets corresponding
to waves traveling along different paths around the earth. If
the earth is nonrotating and spherically symmetric,
all these paths are of the same length and have the same travel
times. However, if some paths take longer than others, the relation
between the wave velocity and eigenfrequency (Eqn 21)
shows that the corresponding eigenfrequencies will differ.
Thus splitting occurs when waves traveling on different paths
encounter different velocities. Put another way, splitting occurs
when the actual positions on earth of the source and the
receiver, not just their relative positions, matter.

Mode splitting due to the rotation of the earth reflects two
effects. The direct effect is that the Coriolis force due to the
rotation causes splitting, because waves traveling in the direction
of the rotation travel faster than those going the other way.
The splitting is proportional to the ratio of the mode’s period to
that of the earth’s rotation (24 hours), so this effect is largest
for ${}_0S_2$ and decreases for shorter-period modes. An indirect
effect is that the rotating earth takes an elliptical shape
(Section A.7), so waves traveling across the poles travel a distance
67 km shorter than waves traveling around the equator, causing
the multiplets to be split. Figure 2.9-16 shows rotational
and elliptical splitting for the ${}_0S_2$ multiplet. The amplitudes of
the split singlets are predicted to be greatest for m = ±1, smaller
for m = ±2, and zero for m = 0. Interference between the singlets
with slightly different frequencies causes the time series for the
multiplet to show beating (Section 2.8.1).
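
The beating produced by two split singlets can be sketched numerically.
The example below assumes NumPy. The two singlet frequencies are
illustrative values of the right order for a very long-period multiplet,
not measurements taken from Fig. 2.9-16, and the function names are
hypothetical. It relies on the identity
cos a + cos b = 2 cos((a - b)/2) cos((a + b)/2), so the sum of two
equal-amplitude singlets is a carrier at the mean frequency modulated by
a slow envelope that repeats every 1/|f1 - f2|.

```python
import numpy as np


def beat_period(f1_hz, f2_hz):
    """Time between successive envelope maxima for two nearby frequencies."""
    return 1.0 / abs(f1_hz - f2_hz)


def two_singlet_series(f1_hz, f2_hz, times_s):
    """Sum of two equal-amplitude cosines and its slowly varying envelope."""
    times_s = np.asarray(times_s, dtype=float)
    series = (np.cos(2.0 * np.pi * f1_hz * times_s)
              + np.cos(2.0 * np.pi * f2_hz * times_s))
    envelope = 2.0 * np.abs(np.cos(np.pi * (f1_hz - f2_hz) * times_s))
    return series, envelope


if __name__ == "__main__":
    f1, f2 = 0.300e-3, 0.318e-3  # Hz, illustrative singlet frequencies
    print(f"beat period: {beat_period(f1, f2) / 3600.0:.1f} hours")
```

With more than two singlets of unequal amplitude the envelope is less
regular, but the same interference produces the modulation.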

Mode splitting can also be caused by any other process that
makes some wave paths faster than others. For instance, splitting
results from lateral variations in velocity, or inhomogeneity,
within the earth. Seismic velocities vary laterally at any given
depth by a few percent at most, but these variations are vital for 
understanding tectonic effects, including mantle convection 
(Section 5.1). Thus, just as the average frequencies of mode 
multiplets are significant for determining the radial velocity 
structure of the earth, so the frequencies of singlets help resolve 
the three-dimensional structure. Splitting also results from 
seismic anisotropy (Section 3.6), which occurs when waves 
traveling in different directions through a region travel at 
different velocities. For example, Fig. 3.6-13 shows splitting 
resulting from anisotropy in the inner core. 

A related effect is called mode coupling. Recall that in the
homogeneous string the modes were purely orthogonal and did
not interact with each other. Similarly, in the ideal SNREI
earth, energy is not transferred from the oscillations of one
mode to another. However, real-earth effects like rotation,
ellipticity, lateral inhomogeneity, and anisotropy affect not only
the eigenfrequencies, but also the eigenfunctions. As a result,
the eigenfunction of a given mode contains both the eigenfunction
it would have for a SNREI earth and perturbations due to
contributions from the eigenfunctions of some other modes
with very similar eigenfrequencies. Coupling can occur between
modes on separate branches, between modes on the same
branch, and even within a single mode multiplet between different
azimuthal orders. Thus, although an earthquake should
excite only the m = 0, ±1, and ±2 singlets because it radiates energy
in a pattern with fourfold symmetry about the fault plane,
energy is transferred to the other singlets. Some coupling also
occurs between torsional and spheroidal modes, much as plane
P-SV and SH waves can be coupled at a dipping interface
(Section 2.5.2). Hence torsional modes can contribute to the radial
displacement, which would not be possible for a SNREI earth.
As a result, some spectral peaks in Fig. 2.9-2 have several mode
labels, corresponding to modes with similar frequencies that
are coupled. These composite modes are called supermultiplets.

Fig. 2.9-16 Splitting observations for the football mode ${}_0S_2$ from
the great 1960 Chilean earthquake, recorded at station Isabella
(California). Splitting causes the singlets to stand out as distinct peaks
in the spectrum and the time series to show beating due to interference
between the singlets. A synthetic seismogram, computed by predicting the
singlet amplitudes and combining them in the time domain with the effects
of attenuation and finite seismogram length, matches the data better than
a similar synthetic seismogram without rotational splitting. (Geller and
Stein, 1977; Stein and Geller, 1978. © Seismological Society of America.
All rights reserved.)

Although the theory of mode splitting and coupling is beyond
our scope, it is worth noting that it is closely allied to
conceptually similar problems in other branches of science. The
splitting due to earth’s rotation is similar to that for waves in
a rotating bowl of water, or to the Zeeman effect in atomic
physics, where spectral lines are split by a magnetic field. The
normal mode problems are addressed by exploring how perturbations
to the equation of motion due to rotation, ellipticity,
lateral heterogeneity, etc. change the eigenfrequencies and
eigenfunctions from those for an unperturbed (SNREI) earth.

In summary, the peaks in a normal mode spectrum reflect
the combined effects of the earthquake, spherical and elastic
earth structure, attenuation, rotation, ellipticity, lateral
heterogeneity, and anisotropy. As a result of extensive studies, these
effects are surprisingly well modeled, as shown by the good
(though not perfect) agreement between the synthetic and observed
spectra in Fig. 2.9-2. Thus, as is so often the case, data
showing the deviations of the real earth from a simple model
are used to explore these deviations and better describe the real
earth.


Further reading 

Further information about the topics of this chapter can be obtained from
many sources, a few of which are listed here. Basic wave concepts are
discussed in books on wave propagation (e.g., Bland, 1988; French, 1971;
Main, 1978), classical mechanics (e.g., Feynman et al., 1963; Marion,
1970), and applied mathematics (e.g., Butkov, 1968; Morse and Feshbach,
1953; Menke and Abbott, 1990; Snieder, 2001). Introductions to topics
in continuum mechanics are given by Fung (1965, 1969) and Malvern
(1969). Fermat’s principle, Huygens’ principle, and diffraction are
discussed in optics texts like Baker and Copson (1950) and Klein and Furtak
(1986).

Several introductory texts treat the seismological material in this
chapter, including Ewing et al. (1957), Officer (1958), Richter (1958), Bullen
and Bolt (1985), Lay and Wallace (1995), Shearer (1999), and Udías
(1999). Advanced treatments beyond our discussions are given by Aki and
Richards (1980), Hudson (1980), Ben-Menahem and Singh (1981),
Lapwood and Usami (1981), Kennett (1983), Båth and Berkhout (1984),
and Dahlen and Tromp (1998).

A number of sources discuss specific topics that we address. Geller and
Stein (1978) discuss string examples like those used here, including
treatments of the source term and of the modes of a non-uniform string.
Young and Braile (1976) review the solutions for reflection and
transmission at a solid-solid interface, and give the computer program used
to calculate the energies in Figs. 2.6-11 and 2.6-12. Madariaga (1972)
derives the equivalence between modes and traveling waves.








11. For the strain tensor 


1. What are the reflection and transmission coefficients for a junction 
between two identical strings? Give a physical interpretation of the 
result. 

2. In Fig. 2.2-6, find the seismic velocities of the two different string 
segments by measuring the distance versus time slope of the wave 
pulses on the left and right sides of the figure. Are these velocities 
the same as the velocities given in the figure caption? 

3. For the stress tensor

$$
\begin{pmatrix} 2 & 1 & 3 \\ 1 & -1 & -2 \\ 3 & -2 & 5 \end{pmatrix}
$$

find the traction on

(a) the x-y plane,

(b) the y-z plane,

(c) the plane with normal (3, 2, -1).

4. To derive the reflection coefficient for the end of a string: 

(a) Express the total displacement due to incident and reflected 
harmonic waves of unknown amplitudes. 

(b) Find the relation between these amplitudes at a fixed string 
end, where the displacement is zero, and at a free end, where 
the traction is zero. 

5. For the stress tensor

$$
\sigma = \begin{pmatrix} 0 & 2 & 0 \\ 2 & 0 & 0 \\ \ldots & \ldots & \ldots \end{pmatrix}
$$

(a) Find the principal stresses and their associated directions. 

(b) Find the surfaces on which the maximum tangential traction 
occurs, and the value of this traction. 

6. Estimate the pressure expected at a depth of 1000 km in the earth. 

7. Given the stress tensor, whose elements are in kbar:

$$
\sigma = \begin{pmatrix} -150 & -2 & 1 \\ -2 & -155 & 3 \\ 1 & 3 & -145 \end{pmatrix}
$$

(a) What physical situation do the large negative values on the 
diagonal represent? 

(b) What is the mean stress? 

(c) What is the deviatoric stress tensor? 

(d) At what depth in the earth might this state of stress be 
found? 

8. Give an example of a strain tensor for which there is 

(a) an increase in volume, 

(b) a decrease in volume, 

(c) shear strain but no volume change. 

Which of these strains could result from a P wave, and which could 
result from an S wave? 

9. Estimate by what fraction the volume of a block of a Poisson solid 
with the rigidity of crustal rock will be compressed at a depth of 
30 km relative to its volume at the earth’s surface. 

10. Determine whether the Lamé constant λ can be negative and, if so,
under what conditions.


(a) Find the corresponding stress tensor, assuming an isotropic
solid with Lamé constants λ and μ.

(b) Find the stored elastic strain energy W.

12. Give a physical interpretation of the fact that Young’s modulus for 
rubber is less than that for steel. 

13. An alternative to using potentials to find seismic wave solutions to
the equation of motion in terms of displacements is to formulate
wave equations for the dilatation and curl of the displacement
field. To see this:

(a) Take the divergence of Eqn 2.4.12 to obtain a wave equation
for the dilatation θ. At what velocity does θ propagate?

(b) Take the curl of Eqn 2.4.12 to obtain a wave equation for
∇ × u. At what velocity does ∇ × u propagate?

14. Derive the constitutive law (Eqn 2.3.70) for an isotropic and
linearly elastic material using the $c_{ijkl}$ in Eqn 2.3.69.

15. Derive the ratio of P- and S-wave velocities in a Poisson solid.

16. Use the gradient operator in spherical coordinates (Eqn A.7.14) to
find the displacement field from the spherical wave scalar potential
f(t - r/v)/r. How would you approximate the displacements
near the source? How would you approximate the displacements
far from the source?

17. On a seismometer located at an earthquake hypocenter, the phases
reflected from the core, PcP and ScS, arrive at 8 minutes, 31 seconds,
and 15 minutes, 36 seconds, respectively, after the earthquake.
If the earth’s radius is 6371 km, and the core’s radius is
3480 km:

(a) Find the average P- and S-wave velocities in the earth’s
mantle.

(b) Use these average velocities to estimate how close the mantle
is to a Poisson solid.

18. Estimate the P- and S-wave velocities in the upper mantle by
assuming that it is a Poisson solid, and that the earthquake for
which seismograms are shown in Fig. 2.4-8 occurred at a depth of
280 km. Compare these velocities to the average mantle values.
Note that the seismograms do not start at the earthquake origin
time.

19. To get a feel for the distance and time scales in seismic wave
propagation, consider waves propagating in a material with velocity
8 km/s.

(a) Find the wavelengths of waves with periods of 0.1 s, 1 s, 
and 100 s. 

(b) Find the periods and frequencies of waves with wavelengths 
of 1 m, 1 km, and 100 km. 

20. For waves propagating in an arbitrary direction given by the
wavenumber vector k,

(a) Show that the P-wave displacement due to the scalar
potential

$$
\phi(\mathbf{x}, t) = e^{i(\omega t - \mathbf{k} \cdot \mathbf{x})}
$$

is parallel to the propagation direction.

(b) Show that the S-wave displacement due to the vector
potential

$$
\boldsymbol{\Upsilon}(\mathbf{x}, t) = \mathbf{A}\, e^{i(\omega t - \mathbf{k} \cdot \mathbf{x})},
\qquad \mathbf{A} = (A_x, A_y, A_z),
$$

is perpendicular to the propagation direction.

21. For a medium composed of upper, middle, and lower layers with 
velocities of 6, 8, and 10 km/s, calculate the angle of incidence in 
the 8 and 10 km/s layers for a ray with an incidence angle of 10° 
in the 6 km/s layer. What is the smallest angle of incidence in the 
6 km/s layer that causes total internal reflection at the 8 km/s- 
10 km/s interface? 

22. For the two cases of an incident wave hitting a plane boundary 
between two media shown in Fig. P2.1, 

(a) Determine which waves are P waves and which are S waves. 

(b) Determine which media are liquid and which are solid. 

(c) For the two media in each case, determine which has the 
higher P-wave velocity. 

23. Consider two rays that originate from a source at x = 0, z = 0,
in a medium with velocity 1 km/s with angles of incidence 0°
and 30° (Fig. P2.2). Assume that these rays cross an interface at
z = 2 km into a medium with velocity 1.5 km/s and travel to the
boundary at z = 4 km. For each of the ray paths:

(a) Compute the angle of incidence in the upper layer, the ray
path length in each layer, and the total travel time.

(b) Compute the components and magnitude of the slowness
vector s = (p, η) in each layer. Check that the magnitude is
related to the velocity as expected.

(c) Derive the total travel time from the scalar product of slowness
and distance (s · x) for the ray path. Remember to use
the appropriate slowness components and horizontal and
vertical distances in each layer. Check that these travel times
agree with those from (a).



Fig. P2.2 See Problem 23. 


24. Fermat’s principle problems:

(a) Use Fermat’s principle to show that the angles of incidence
for the incident and reflected waves at the surface of a
homogeneous halfspace (Fig. 2.5-13) are equal.

(b) Use the second derivative of the travel time to determine
whether the ray path in (a) is a minimum- or a maximum-time
path.

(c) Use the second derivative of the travel time to show that the
refracted ray path in Fig. 2.5-14 is a minimum-time path.

25. For an SV wave incident on a free surface:

(a) Write the potentials for the incident SV wave and reflected P
and SV waves.

(b) Derive the continuity equations at the interface in terms of
both the potentials and the amplitude coefficients.

(c) Assume that the potential reflection coefficients, the ratios of
the reflected SV and P potentials to that of the incident SV
wave, are

$$
\frac{B_2}{B_1} = \frac{4p^2\eta_\alpha\eta_\beta - (\eta_\beta^2 - p^2)^2}
{4p^2\eta_\alpha\eta_\beta + (\eta_\beta^2 - p^2)^2},
\qquad
\frac{A_2}{B_1} = \frac{4p\eta_\beta(\eta_\beta^2 - p^2)}
{4p^2\eta_\alpha\eta_\beta + (\eta_\beta^2 - p^2)^2}
$$

Evaluate the potential reflection coefficients at vertical incidence,
and explain the result physically.

(d) Find the displacement magnitude ratios and energy flux
ratios for the two reflected waves relative to the incident
wave.

(e) Show that the energy fluxes satisfy conservation of energy.

26. Show that conservation of energy is satisfied by: 

(a) The energy flux for the incident, reflected and transmitted 
SH waves at an interface (Eqn 2.6.14). 

(b) The energy flux for the incident P wave and reflected P and
SV waves at a free surface (Eqn 2.6.39).

27. For the ScSp conversion at the top of the downgoing slab
(Fig. 2.6-15), assume that ScS is traveling vertically in the slab,
which dips at 30°. Assume that the velocities in the slab are
α = 9.3 km/s and β = 5.2 km/s, and the overlying mantle between
the slab and the surface has velocities α = 8.0 km/s and β = 4.6 km/s.

(a) Find the angle of incidence for ScS and ScSp at the top of the
slab and at the earth’s surface.

(b) Use this result and the seismograms shown to estimate the
depth of the slab. Bear in mind that the ScSp and ScS arrivals
observed at a given station originated from different points
on the slab.

28. For a P wave incident on a horizontal solid-solid interface
(Fig. 2.6-9):

(a) Write the potentials for the incident P wave and reflected P 
and SV waves. 

(b) Derive the four continuity equations at the interface in terms 
of the potentials. 

29. For the Love waves in a layer over a halfspace, use the model in 
Fig. 2.7-9 to derive the cutoff frequencies for the first and second 
higher modes. Compare these results to the figure. 

30. A second way to study the downgoing slab is to use observations
from Japan showing that earthquakes about 1300 km away can
give rise to two P-wave arrivals, a small direct one and a larger
one presumably reflected off the upper surface of the slab
(Fig. 2.6-15). Using the geometry and velocities assumed in problem 27:

(a) Determine the angles of incidence at the surface if the apparent
velocities of the direct and reflected arrivals are 8.5 and
16 km/s.

(b) Determine the angle of incidence at the slab top of the reflection.
To see if the large amplitude of the reflection might
occur because of near-critical incidence, compute this critical
angle and compare the two.

(c) Suppose that a P-to-S wave conversion also occurred at the 
slab top. For the converted wave, find the angle of incidence 
at the slab and the angle of incidence and apparent velocity 
expected at the surface. 

31. For Love waves in a layer over a halfspace, derive a vertical
wavelength to show how the displacement oscillates with depth in the
layer. Also, derive a vertical decay constant for the halfspace, a
distance over which the displacement decays to $e^{-1}$ of its value at
the interface. Show how these quantities vary with apparent velocity
for a given period. For different modes at a given period, interpret
the result in terms of the rate at which the displacement
oscillates in the layer and the depth of penetration in the halfspace.

32. For a dispersive wave, derive the following relations between
group velocity, phase velocity, wavelength, frequency, and period:

(a) $U = c - \lambda \dfrac{dc}{d\lambda}$,

(b) $U = c^2 \dfrac{dT}{d\lambda}$,

(c) $U = -\lambda^2 \dfrac{df}{d\lambda}$.


33. Find the displacements for ${}_0T_3$ as functions of θ and φ in the
manner done for ${}_0T_2$ in Eqns 2.9.12 and 2.9.13.

34. (a) Show that for m = 0,

$$
Y_{l0}(\theta, \phi) = \left( \frac{2l + 1}{4\pi} \right)^{1/2} P_l(\cos\theta).
$$

(b) Use (a) to find the spherical harmonic $Y_{00}$ associated with
radial modes ${}_nS_0$.

(c) Evaluate the vector spherical harmonics associated with the
radial modes and explain what the results imply for these
modes’ displacements.

35. Using the relation between modes and traveling waves and the data
in Table 2.9-1:

(a) Because ${}_0T_2$ samples the mantle fairly uniformly (Fig. 2.9-9a),
assume that the phase velocity appropriate for this
mode is the average mantle shear wave velocity from
problem 17 and find the period you would expect. How does
this compare to the actual period?

(b) Find the phase velocity for the mode ${}_0T_{130}$ and compare it to
that for the Love wave of this period found in the dispersion
calculation (Section 2.7.4).

(c) Find the phase velocity for three modes with similar periods:
${}_4T_{67}$, ${}_{10}T_{40}$, and ${}_{13}T_7$, and interpret the differences.

(d) Find the phase velocities and wavelengths of waves corresponding
to the modes ${}_0S_3$, ${}_0S_{30}$, and ${}_0S_{130}$. Interpret the trend
of the velocities. Which of these modes would you expect to
be most affected by lateral heterogeneity in the earth, and
why?

36. (a) Show that the three vector spherical harmonics $\mathbf{T}_l^m$, $\mathbf{S}_l^m$, $\mathbf{R}_l^m$ are
orthogonal, and explain this result’s physical significance.

(b) Show that there is no volume change associated with torsional
modes, and explain this result’s physical significance.

37. (a) Estimate the magnitude of the splitting of the ${}_0S_2$ multiplet
in Fig. 2.9-16a as the ratio of the separation in frequency
between the m = ±2 singlets to the frequency of m = 0, which is
essentially that of the unsplit multiplet.

(b) We expect that the splitting would be of the order of the ratio
of the unsplit mode’s period to that of the earth’s rotation.
Compute this ratio and compare the result to the results of (a).

Computer problems 

C-1. Write a subroutine to generate the values of the function
cos(ωt - kx). Use it to plot the function as a

(a) function of time from t = 0 to 10, at x = 1, for ω = 1, k = 1.

(b) function of time from t = 0 to 10, at x = 0, for ω = 4, k = 1.

(c) function of position from x = 0 to 10, at t = 0, for ω = 1,
k = 2.

(d) function of position from x = 0 to 10, at t = 0, for ω = 1,
k = 4.

C-2. Write a subroutine that uses the P and S velocities on either side of 
a solid-solid interface and the angle of incidence for a wave of 
a specific type to find the angles of reflection and transmission for 
both P and S waves. The subroutine should calculate and list 
any possible critical angles for that incident wave, and indicate 
whether any of the reflected or transmitted waves are past the 
critical angle. 

C-3. Write a program that takes the velocities and densities on either 
side of a solid-solid interface and finds the vertical incidence 
displacement reflection and transmission coefficients, and energy 
flux ratios, for P and S waves incident from either side. Use the 
program to estimate these quantities for the core-mantle boundary
(although it is a solid-liquid boundary), if the lower mantle
has $\alpha_1 = 13.7$ km/s, $\beta_1 = 7.2$ km/s, $\rho_1 = 5.5$ g/cm$^3$, and the core has
$\alpha_2 = 8.0$ km/s, $\beta_2 = 0.0$ km/s, $\rho_2 = 9.9$ g/cm$^3$.

C-4. Write a program, using the result of C-2, to generate figures like 
the ray paths in Fig. 2.6-11 for an interface with given velocities 
on either side. Use the program to show the ray paths for the 
possible incident wave types on either side of a planar interface 
with the properties of the core-mantle boundary (problem C-3).

Ordinary language undergoes modification to a high pressure form when
applied to the interior of the earth; a few examples of equivalents follow:

High pressure form:        Ordinary meaning:
certain                    dubious
undoubtedly                perhaps
positive proof             vague suggestion
unanswerable argument      trivial objection
pure iron                  uncertain mixture of all the elements

Francis Birch, 1952


3.1 Introduction 

A major application of seismology is the determination of the 
distribution of seismic velocities, and hence elastic properties, 
within the earth. This distribution, known as earth structure,
gives the basic constraint on the mineralogical, chemical, and 
thermal state of the earth’s interior. Seismological data are 
important for this purpose because their resolving power is 
generally superior to that of other geophysical methods. For 
example, although gravity and magnetic data indicate the presence
of a dense fluid core at depth, they provide only relatively
weak constraints on its density and size. By contrast, seismological
data indicate the depth of the core-mantle boundary
and the sharp change in properties that occurs there. Above the 
boundary, both P and S waves propagate in the solid mantle, 
whereas in the liquid outer core no S waves propagate and the 
P-wave velocity drops sharply. The observed velocities are the 
primary basis for our models of the physical properties and 
chemical composition of the material on either side of this 
boundary. Similarly, the distinction between the crust and the 
mantle and many inferences about their structure and composition
come from seismological observations. More generally,
by establishing the essentially layered structure of the earth, 
seismology provides the primary evidence for the process 
of differentiation whereby material within planets became 
compositionally segregated during their evolution. As a result, 
many crucial issues about the other terrestrial planets could be 
resolved if seismological data were available. 


Constraints from seismology are crucial for other disciplines 
of the earth sciences, and vice versa. Seismology gives earth 
models describing the distribution of P- and S-wave velocities 
and density. Going from an earth model to a description of 
the chemical, mineralogical, thermal, and rheological state of 
the earth’s interior requires additional information. There are 
thus two types of uncertainty in our knowledge of the earth’s 
interior. In some cases, such as the structure of the inner core, 
the seismological results are still under discussion. In others — 
for example, the nature of the 660 km discontinuity in the 
mantle — the basic seismological results are generally accepted, 
but their mineralogic and petrologic interpretations remain 
under investigation. Given our scope here, we only summarize 
the implications of seismological data for models of the earth’s 
interior. 

The fundamental data for seismological studies of the 
earth’s interior are the travel times of seismic waves. The measurements
available are the arrival times of seismic waves at
receivers. To convert these to travel times, the origin time and 
location of the source must be known. These parameters, 
which are known for artificial sources, must be estimated from 
the observations for earthquake sources. Hence travel time 
data include information about both the source and the properties
of the medium, and separating the two is a challenge
in many seismological studies. 

The travel times are used to learn about the velocity structure 
between the source and the receiver. As we saw in the last 
chapter, waves follow paths that depend on the velocity
structure. Hence the structure must be known to find the paths
that the waves took. To illustrate this, consider the travel time 
between two points. If the velocity were constant, the ray path 
would be a straight line, and the velocity could be found by 
dividing the distance by the travel time. If, instead, an interface 
separates media with different velocities, the ray path would 
consist of two line segments, depending on the velocities, and 
the travel time would be the sum of the time spent along each 
segment. For a more complicated velocity distribution, the ray 
path would also be more complicated. 

This problem can be posed mathematically by writing the 
travel time between the source (s) and receiver (r) as the integral 
of 1/velocity, or slowness, along the ray path 


$$T(s, r) = \int_s^r \frac{1}{v(x)}\,dx. \qquad (1)$$


In simple cases, where the ray path is a set of segments with 
constant velocity, the integral is just a sum over the time 
in each segment. Thus the travel time gives an integral constraint
on the velocity distribution between the source and
the receiver, but does not indicate which of the many paths 
satisfying the constraint the ray followed. As a result, an 
individual measurement is inadequate to show the distribution 
of velocities. Fortunately, as we shall see, a set of travel times 
between different sources and receivers provides much more 
information. In addition, useful information is derived from 
the amplitudes and waveforms of seismic waves. 
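
The sum over segments, and the non-uniqueness of a single travel time, can be illustrated with a minimal sketch. The function name and the numbers below are illustrative assumptions, not values from the text: two hypothetical velocity distributions along the same 100 km path are built to give the same travel time.

```python
def travel_time(segments):
    """Evaluate Eqn 1 for piecewise-constant velocity.

    segments: iterable of (length_km, velocity_km_per_s) pairs along the ray.
    Returns the travel time in seconds as the sum of length/velocity.
    """
    return sum(length / velocity for length, velocity in segments)


# Two hypothetical models for the same 100 km path (illustrative values only).
uniform = [(100.0, 5.0)]                 # constant 5 km/s
two_part = [(40.0, 4.0), (60.0, 6.0)]    # 40 km at 4 km/s, then 60 km at 6 km/s

t_uniform = travel_time(uniform)         # 100/5
t_two_part = travel_time(two_part)       # 40/4 + 60/6

print(t_uniform, t_two_part)
```

Both models predict the same time, so one measurement cannot distinguish them; times along other paths are needed.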

This example illustrates an interesting feature of determining 
velocity structure from travel times. If the velocity structure is 
known, the forward problem of finding the travel times and 
amplitudes is straightforward. However, the inverse problem 
of using the travel times and amplitudes measured at the surface
to find the velocity structure at depth is more difficult, and
various methods are used. For example, in addition to using 
travel times directly, we have seen that velocity structure is 
studied using the dispersion of surface waves (Section 2.8) and 
the eigenfrequencies of normal modes (Section 2.9), quantities 
that correspond to travel times. 

In this chapter, we follow the approach discussed in Section 
1.1.2 of treating the earth with a series of progressively more 
complex and, hopefully, more accurate models. We begin with 
the homogeneous, isotropic, elastic, layered halfspace used in 
Chapter 2 to derive seismic wave propagation. This approximation
of uniform flat layers is often used in crust and upper
mantle studies, where the distance between source and receiver 
is less than a few hundred kilometers. We then consider larger 
source-receiver distances, for which spherical geometry is 
required, and then the anisotropic and anelastic behavior of the 
earth. Throughout these discussions, we will see that although 
velocity varies primarily with depth, there are important lateral 
variations, or heterogeneities. Finally, we consider the implications
of the observed heterogeneous, anisotropic, and anelastic
velocity structure for the composition of the earth. Later, in
Chapter 7, we discuss further how seismic data can be used to 
study laterally variable velocity structure. 


3.2 Refraction seismology 

3.2.1 Flat layer method

The simplest approach to the inverse problem of determining 
velocity at depth from travel times treats the earth as flat layers 
of uniform-velocity material. We thus begin by deriving the 
travel time curves for such a model, which show when seismic 
waves arrive at a particular distance from a seismic source. 
The travel times, especially those of waves that are critically 
refracted at the interfaces, are used to find the velocities of the 
layers and underlying halfspace and the layer thicknesses. As a 
result, this technique is called refraction seismology. 

Refraction seismology is used on vastly differing scales. 
Near-surface structure at depths less than 100 meters can be 
studied using a sledge hammer or a shotgun as a source and a 
single receiver. Similar methods are used to study the crust and 
the upper mantle, with earthquake or explosion sources and 
many receivers at distances of hundreds of kilometers. 

The simplest situation, shown in Fig. 3.2-1, is a layer of 
thickness $h_0$, with velocity $v_0$, overlying a halfspace with a
higher velocity, $v_1$. We write the velocities as “v” to indicate
that the analysis applies for either P or S waves. There are three 
basic ray paths from a source on the surface at the origin to a 
surface receiver at x. The travel times for these paths can be 
found using Snell’s law. 

The first ray path corresponds to a direct wave that travels 
through the layer with travel time 

$$T_D(x) = x/v_0. \qquad (1)$$

This travel time curve (Fig. 3.2-2) is a linear function of distance,
with slope $1/v_0$, that goes through the origin.

The second ray path is for a wave reflected from the interface.
Because the angles of incidence and reflection are equal,


Fig. 3.2-1 Three basic ray paths for a layer over a halfspace model. The
direct and reflected rays travel within the layer, whereas the head wave
path also includes a segment just below the interface. For the head wave
to exist, the layer velocity $v_0$ must be less than the halfspace velocity $v_1$.

Fig. 3.2-2 Travel time versus source-to-receiver distance plot for the three
ray paths in Fig. 3.2-1. The direct wave is the first arrival for receivers
closer than the crossover distance $x_d$. Beyond $x_d$ the head wave arrives
first. The head wave exists only beyond the critical distance $x_c$.


the wave reflects halfway between the source and the receiver. 
The travel time curve can be found by noting that $x/2$ and
$h_0$ form two sides of a right triangle, so

$$T_R(x) = 2(x^2/4 + h_0^2)^{1/2}/v_0. \qquad (2)$$

This curve is a hyperbola, because it can be written

$$T_R^2(x) = x^2/v_0^2 + 4h_0^2/v_0^2. \qquad (3)$$

For $x = 0$ the reflected wave goes straight up and down, with a
travel time of $T_R(0) = 2h_0/v_0$. At distances much greater than
the layer thickness ($x \gg h_0$), the travel time for the reflected
wave asymptotically approaches that of the direct wave.

The third type of wave is the head wave, often referred to as
a refracted wave. This wave results when a downgoing wave
impinges on the interface at an angle at or beyond the critical
angle. Its travel time can be computed by assuming that the wave
travels down to the interface such that it impinges at the critical
angle, then travels just below the interface with the velocity of
the lower medium, and finally leaves the interface at the critical
angle and travels upward to the surface. Thus the travel time is
the horizontal distance traveled in the halfspace divided by $v_1$
plus that along the upgoing and downgoing legs divided by $v_0$:

$$T_H(x) = \frac{x - 2h_0 \tan i_c}{v_1} + \frac{2h_0}{v_0 \cos i_c} = \frac{x}{v_1} + 2h_0\left(\frac{1}{v_0 \cos i_c} - \frac{\tan i_c}{v_1}\right). \qquad (4)$$


The last step used the fact that the critical angle (Section 2.5.5) 
satisfies 


$$\sin i_c = v_0/v_1. \qquad (5)$$

To simplify Eqn 4, we use trigonometric identities showing that

$$\cos i_c = (1 - \sin^2 i_c)^{1/2} = (1 - v_0^2/v_1^2)^{1/2} \qquad (6)$$

and

$$\tan i_c = \frac{\sin i_c}{\cos i_c} = \frac{v_0/v_1}{(1 - v_0^2/v_1^2)^{1/2}}, \qquad (7)$$

so Eqn 4 can be written

$$T_H(x) = x/v_1 + 2h_0(1/v_0^2 - 1/v_1^2)^{1/2} = x/v_1 + \tau_1. \qquad (8)$$


Thus the head wave’s travel time curve is a line with a slope
of $1/v_1$ and a time axis intercept of

$$\tau_1 = 2h_0(1/v_0^2 - 1/v_1^2)^{1/2}. \qquad (9)$$

This intercept is found by projecting the travel time curve
back to $x = 0$, although the head wave appears only beyond the
critical distance, $x_c = 2h_0 \tan i_c$, where critical incidence first
occurs.

Because $1/v_0 > 1/v_1$, the direct wave’s travel time curve has a
higher slope but starts at the origin, whereas the head wave has
a lower slope but a nonzero intercept. At the critical distance
the direct wave arrives before the head wave. At some point,
however, the travel time curves cross, and beyond this point the
head wave is the first arrival even though it traveled a longer
path. The crossover distance where this occurs, $x_d$, is found by
setting $T_D(x) = T_H(x)$, which yields

$$x_d = 2h_0\left(\frac{v_1 + v_0}{v_1 - v_0}\right)^{1/2}. \qquad (10)$$

Hence the crossover distance depends on the velocities of the
layer and the halfspace and the thickness of the layer.$^1$

Thus we can solve the inverse problem of finding the velocity 
structure at depth from the variation of the travel times observed
at the surface as a function of source-receiver distance.
This simple structure is described by three parameters. The two
velocities, $v_0$ and $v_1$, are found from the slopes of the two travel
time curves. We then identify the crossover distance and use
Eqn 10 to find the third parameter, the layer thickness, $h_0$.
Alternatively, the layer thickness can be found from the reflection
time or the head wave intercept (Eqn 9) at zero distance.
Each of these methods exploits the fact that there is more than 
one ray path between the source and the receiver. 
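
As a sketch of how Eqns 1, 2, 8, 9, and 10 fit together, the following Python fragment computes the three travel time curves and then inverts the head wave intercept for the layer thickness. The function names are our own, and the numerical values are only an illustration: a 30 km layer of 6.5 km/s material over an 8 km/s halfspace, the average structure of the crustal models in Fig. 3.2-14.

```python
import math


def direct_time(x, v0):
    """Direct wave travel time, Eqn 1."""
    return x / v0


def reflected_time(x, h0, v0):
    """Travel time of the reflection from the interface, Eqn 2."""
    return 2.0 * math.sqrt(x**2 / 4.0 + h0**2) / v0


def head_wave_intercept(h0, v0, v1):
    """Time axis intercept of the head wave, Eqn 9."""
    return 2.0 * h0 * math.sqrt(1.0 / v0**2 - 1.0 / v1**2)


def head_wave_time(x, h0, v0, v1):
    """Head wave travel time, Eqn 8 (the wave exists only for x >= x_c)."""
    return x / v1 + head_wave_intercept(h0, v0, v1)


def critical_distance(h0, v0, v1):
    """x_c = 2 h0 tan(i_c), with sin(i_c) = v0/v1 (Eqn 5)."""
    return 2.0 * h0 * math.tan(math.asin(v0 / v1))


def crossover_distance(h0, v0, v1):
    """Distance beyond which the head wave is the first arrival, Eqn 10."""
    return 2.0 * h0 * math.sqrt((v1 + v0) / (v1 - v0))


def thickness_from_intercept(tau1, v0, v1):
    """Inverse step: solve Eqn 9 for the layer thickness."""
    return tau1 / (2.0 * math.sqrt(1.0 / v0**2 - 1.0 / v1**2))


# Illustrative model (km and km/s)
h0, v0, v1 = 30.0, 6.5, 8.0
x_c = critical_distance(h0, v0, v1)
x_d = crossover_distance(h0, v0, v1)
tau1 = head_wave_intercept(h0, v0, v1)

print("critical distance (km):", x_c)
print("crossover distance (km):", x_d)
print("recovered thickness (km):", thickness_from_intercept(tau1, v0, v1))
```

In practice $v_0$, $v_1$, and $\tau_1$ would be measured from the slopes and intercept of observed travel time curves rather than computed from a known model. At $x_c$ the head wave and reflected wave times coincide, because there the head wave path has no segment along the interface.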


1 A simple analogy is driving to a distant point by a route combining streets and a 
highway. If the destination is far enough away, it is quicker to take a longer route 
including the faster highway than a direct route on slower streets. The point at which 
this occurs depends on the relative speeds and the additional distance required to use 
the highway.

Fig. 3.2-3 Generation of an upgoing head wave by Huygens’ sources due
to a refracted pulse propagating along a boundary. The head wave travels
in the upper layer at a slower velocity ($v_0$) than the refracted wave creating
it, which travels in the layer below at velocity $v_1$. (After Griffiths and King,
1981.)


Despite this solution’s elegance, the basic assumption about 
the travel time of the head wave may seem unsatisfying, because 
it is unclear why energy should follow this path. However, 
the result conforms with observations — the experiment 
diagrammed in Fig. 3.2-1 yields an arrival whose travel time is 
given by Eqn 8. To understand why, we can view the head wave 
in several ways. As shown in this chapter’s problems, it corresponds
to a minimum time path between the source and the
receiver, so, by Fermat’s principle (Section 2.5.9), we expect 
such a wave. Another approach, using Huygens’ principle 
(Section 2.5.10), is to consider the refracted wave traveling 
horizontally below the boundary at the velocity of the halfspace,
generating spherical waves that propagate upward in the
lower-velocity layer (Fig. 3.2-3). The spherical waves interfere
to produce upgoing plane waves that leave the interface at the
critical angle.$^2$ However, our analysis of postcritical incidence
(Section 2.6.4), which showed that an evanescent wave propagates
along the interface, does not fully describe the head
wave. A more sophisticated analysis than is appropriate here 
shows that the geometry in Fig. 3.2-1 gives the head wave’s 
travel time, but not its amplitude, because geometrical optics 
are not applicable. Thus, although the energy propagation is 
more complicated than along the geometric ray path, the travel 
time predicted is correct. 

Seismic refraction data led A. Mohorovičić$^3$ in 1909 to
one of the most important discoveries about earth structure.
Observing two P arrivals (Fig. 3.2-4), he identified the first as
having traveled in a deep high-velocity (7.7 km/s) layer, and
the second as a direct wave in a slower (5.6 km/s) shallow layer
about 50 km thick. These layers, now identified around the
world, are known as the crust and the mantle. The boundary
between them is known as the Mohorovičić discontinuity,
or Moho. We now denote the head wave as $P_n$ and the direct
wave as $P_g$ (“g” for “granitic”). Corresponding arrivals are also
observed for S waves. The Moho, which defines the boundary 


2 This situation is analogous to a bow wave from a boat or a supersonic wave from a 
jet airplane, in that the energy source travels faster than the wave it produces. 

3 Andrija Mohorovicic (1857-1936), working in Zagreb, Croatia (then part of the 
Austro-Hungarian Empire), studied travel times from earthquakes in the region using 
recently invented pendulum seismographs. 



Fig. 3.2-4 Schematic of Mohorovičić’s results showing the existence of a
distinct crust and mantle. The travel time curves are labeled using modern
nomenclature: the direct waves are $P_g$ and $S_g$, and the head waves are $P_n$
and $S_n$. (After Bonini and Bonini, 1979. Eos, 60, 699-701, copyright by
the American Geophysical Union.)


between the crust and the mantle, has been observed around 
the world. One of the first steps in studying the nature of the 
crust is characterizing the depth to Moho, or crustal thickness, 
and the variation in $P_n$ velocity from site to site.

Travel time plots for refraction experiments can be made by 
displaying seismograms in record sections. Because seismograms
are functions of time, aligning several as a function of
distance yields a travel time plot showing the different arrivals.
Figure 3.2-5 shows a record section of a profile of seismograms
recorded in England from explosive sources. In addition to
$P_n$ and $P_g$, the reflection off the Moho, known as $P_mP$, is well
recorded. As expected, the direct and head wave travel times
are linear with distance, whereas the reflection has a hyperbolic
curvature. The figure is plotted as a reduced travel time plot,
in which the time shown is the true time minus the distance 
divided by a constant velocity. This reduces the size of the plot, 
and makes waves arriving at the reducing velocity appear as a 
line parallel to the distance axis. 
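
The reduction can be written in one line. In this minimal sketch the function name, the 2 s intercept, and the distances are hypothetical; it shows that arrivals traveling at the reducing velocity plot at a constant reduced time.

```python
def reduced_time(t, x, v_reduce):
    """Reduced travel time: true time minus distance / reducing velocity."""
    return t - x / v_reduce


# Hypothetical arrivals with apparent velocity 6 km/s and a 2 s intercept
v_reduce = 6.0                              # km/s, as in Fig. 3.2-5
distances = [60.0, 120.0, 180.0]            # km
times = [x / 6.0 + 2.0 for x in distances]  # true travel times, s
reduced = [reduced_time(t, x, v_reduce) for t, x in zip(times, distances)]

print(reduced)  # each value is the 2 s intercept, to rounding
```

Arrivals faster than the reducing velocity slope downward with distance on such a plot, and slower arrivals slope upward.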

The geometry discussed here can correspond to different 
physical experiments. A single source can be recorded simultaneously
at receivers at different distances. Alternatively,
multiple sources at different distances can be recorded by a single 
receiver at different times. A single receiver can be moved away 
from a fixed source, so the same source is recorded at different 
distances. Similarly, a source can be moved away from a fixed 
receiver. Results of various experiments can be combined, 
using the principle of reciprocity, which states that the travel
time is unchanged if the source and the receiver are interchanged.
As a result, we can use travel time measurements
without considering whether the source was at one position 
and the receiver at another, or the reverse. Moreover, because 
earth structure presumably is not changing during the experiment,
data collected at different times can be combined.

Fig. 3.2-5 Seismograms from a refraction profile, plotted with a reducing
velocity of 6 km/s. The direct wave $P_g$, Moho head wave $P_n$, and Moho
reflection $P_mP$ are observed. $P_g$ does not asymptotically approach $P_mP$
as in Fig. 3.2-2 because the crust, instead of being homogeneous, has
increasing velocity with depth. (Bott et al., 1970. From Mechanism of
Igneous Intrusion, ed. G. Newall and N. Rast, © 1970 by John Wiley &
Sons Ltd. Reproduced by permission.)

Refraction data often show other arrivals in addition to $P_g$,
$P_n$, and $P_mP$. Figure 3.2-6 shows a record section that also contains
head waves $P_i$ and $P_{n2}$ from boundaries within the crust
and the mantle and $P_iP$, a reflection off a mid-crustal interface,
which is analogous to the $P_mP$ reflection off the Moho.

Such data require a model with multiple layers. Figure 3.2-7 
shows a model in which a head wave arises at each interface 
where the velocity increases with depth. The travel time curve 
for a head wave at the top of the $n$th layer is a line with slope
$1/v_n$, that can be extrapolated to its intercept on the $t$ axis, $\tau_n$,
and written

$$T_{Hn}(x) = x/v_n + \tau_n, \qquad (11)$$

where, by analogy to the layer over the halfspace case (Eqn 9),

$$\tau_n = 2\sum_{j=0}^{n-1} h_j (1/v_j^2 - 1/v_n^2)^{1/2}. \qquad (12)$$

The thickness of successive layers can be found by starting with
the top layer, whose thickness $h_0$ is given by Eqn 9 or 10, and
continuing downward using the iterative formula

$$h_{n-1} = \frac{\tau_n - 2\sum_{j=0}^{n-2} h_j (1/v_j^2 - 1/v_n^2)^{1/2}}{2(1/v_{n-1}^2 - 1/v_n^2)^{1/2}}. \qquad (13)$$


Thus for two layers over a halfspace, the thickness of the
second layer is found by setting $n = 2$, so

$$h_1 = \frac{\tau_2 - 2h_0(1/v_0^2 - 1/v_2^2)^{1/2}}{2(1/v_1^2 - 1/v_2^2)^{1/2}}. \qquad (14)$$
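
The top-down use of Eqns 12 and 13 can be sketched as follows. The function names and the four-velocity model are hypothetical, and the sketch assumes that velocity increases with depth at every interface, so that each interface produces a head wave.

```python
import math


def intercept_times(h, v):
    """Forward problem, Eqn 12.

    h: thicknesses h_0 .. h_{N-1} of the layers (km)
    v: velocities v_0 .. v_N, the last being the halfspace (km/s)
    Returns [tau_1, ..., tau_N].
    """
    taus = []
    for n in range(1, len(v)):
        tau_n = 2.0 * sum(
            h[j] * math.sqrt(1.0 / v[j]**2 - 1.0 / v[n]**2) for j in range(n)
        )
        taus.append(tau_n)
    return taus


def thicknesses_from_intercepts(taus, v):
    """Inverse problem, Eqn 13, applied from the top layer downward.

    taus[n - 1] holds tau_n; returns [h_0, ..., h_{N-1}].
    """
    h = []
    for n in range(1, len(v)):
        # Contribution of the already determined layers 0 .. n-2 to tau_n
        overburden = 2.0 * sum(
            h[j] * math.sqrt(1.0 / v[j]**2 - 1.0 / v[n]**2) for j in range(n - 1)
        )
        denom = 2.0 * math.sqrt(1.0 / v[n - 1]**2 - 1.0 / v[n]**2)
        h.append((taus[n - 1] - overburden) / denom)
    return h


# Hypothetical model: three layers over a halfspace
h_true = [2.0, 10.0, 20.0]         # km
v = [3.0, 5.5, 6.5, 8.0]           # km/s, increasing with depth

taus = intercept_times(h_true, v)
h_found = thicknesses_from_intercepts(taus, v)
print(taus)
print(h_found)
```

For $n = 1$ the overburden sum is empty and the formula reduces to Eqn 9; for $n = 2$ it is Eqn 14.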

A few examples illustrate some other complexities of refraction
experiments. If the velocity increases with depth, the travel
time curve for the head wave at the top of each successive 
layer has a shallower slope. By contrast, a low-velocity layer 
(Fig. 3.2-8) does not cause a head wave, so the travel time curve 
does not have a first arrival with the corresponding velocity, 
and depths to interfaces calculated using Eqn 13 are incorrect. 
Another possible problem occurs if a layer is thin or has a small 
velocity contrast with the one below it. Although a head wave 
results, it may never appear as a first arrival (Fig. 3.2-9), causing
a blind zone that can be missed in the interpretation.

3.2.2 Dipping layer method 

The refraction method can also be applied if the interfaces between
layers are not horizontal. Conducting a reversed profile
yields the travel times for ray paths in both the down-dip 
and the up-dip directions. This can be done using receivers on 
either side of a source, sources on either side of a receiver, 
or both. In this geometry, the depths to the interface below the 
source and the receiver differ due to the dip angle, $\theta$. Consider
the down-dip ray path (Fig. 3.2-10) from a source, below
which the perpendicular distance to the interface is $h_d$, to a
receiver at a distance $x$, below which the perpendicular distance
to the interface is $(h_d + x \sin\theta)$. The travel time for the
head wave in the down-dip direction is the sum of the distance
along the interface divided by $v_1$ plus that for the upgoing and
downgoing legs divided by $v_0$

$$T_d(x) = \frac{x\cos\theta - (2h_d + x\sin\theta)\tan i_c}{v_1} + \frac{2h_d + x\sin\theta}{v_0\cos i_c}. \qquad (15)$$

Fig. 3.2-6 Seismic refraction record section, plotted with a reducing velocity
of 6 km/s. In addition to $P_g$, $P_n$, and $P_mP$, there are also arrivals $P_i$ and $P_{n2}$
interpreted as head waves from boundaries within the crust and the mantle,
and $P_iP$, interpreted as a reflection off a mid-crustal interface.
(Snelson et al., 1998.)

Fig. 3.2-7 Ray paths and travel times for a multilayered model in which
velocity increases with depth. Each layer gives rise to a head wave whose
intercept on the time axis is $\tau_i$, and a reflection. The direct wave
arrival is also shown.

Fig. 3.2-8 Travel time curves, showing first arrivals only, for
a model with three layers over a halfspace. Because the middle
layer is a low-velocity layer with $v_1 < v_0$, no head wave arises
at its top.



Fig. 3.2-9 Travel time curves, showing first arrivals only, for
a blind zone geometry where the head wave from the top of
layer 1 is never the first arrival because this layer is too thin.

Fig. 3.2-10 Head wave ray path in the down-dip direction for a dipping
interface over a higher-velocity halfspace. The layer thickness is measured
perpendicular to the interface.

For the flat case ($\theta = 0$), this is just Eqn 4. Simplifying using
Eqns 5 and 7 yields

$$T_d(x) = \frac{x\cos\theta\sin i_c}{v_0} + \frac{(2h_d + x\sin\theta)(1 - \sin^2 i_c)}{v_0\cos i_c} = \frac{x\sin(i_c + \theta)}{v_0} + \frac{2h_d\cos i_c}{v_0} = \frac{x}{v_d} + \tau_d, \qquad (16)$$

which is a straight line with slope $1/v_d$ and intercept $\tau_d$.

Similarly, the travel time for the head wave in the up-dip
direction is

$$T_u(x) = \frac{x\sin(i_c - \theta)}{v_0} + \frac{2h_u\cos i_c}{v_0} = \frac{x}{v_u} + \tau_u, \qquad (17)$$

where $h_u$ is the perpendicular distance to the interface below
the receiver. Thus the apparent velocities, corresponding to the
slopes of the head wave travel time curves, differ in the up-dip
and down-dip directions by a factor depending on the dip
angle,

$$v_u = v_0/\sin(i_c - \theta), \qquad v_d = v_0/\sin(i_c + \theta). \qquad (18)$$

The apparent velocity in the up-dip direction is greater than
the halfspace velocity, and that in the down-dip direction is
smaller. The time axis intercepts

$$\tau_u = 2h_u\cos i_c/v_0, \qquad \tau_d = 2h_d\cos i_c/v_0 \qquad (19)$$

also differ. The direct wave travel time is the same in both directions,
so the crossover distances differ.

Fig. 3.2-11 Travel time plot for a reversed profile and its interpretation.
The up-dip and down-dip slopes and intercepts differ.

The results of a reversed profile are often displayed in the 
form shown in Fig. 3.2-11. The time axis is common to both 
directions, but distance is measured from one end of the axis 
for the up-dip experiment and from the other for the down-dip. 
The slopes of the direct and head wave travel times yield the dip
angle

$$\theta = \frac{1}{2}\left[\sin^{-1}\frac{v_0}{v_d} - \sin^{-1}\frac{v_0}{v_u}\right] \qquad (20)$$

and the critical angle

$$i_c = \frac{1}{2}\left[\sin^{-1}\frac{v_0}{v_d} + \sin^{-1}\frac{v_0}{v_u}\right]. \qquad (21)$$

The halfspace velocity $v_1$ is found from the critical angle and $v_0$,
and the intercept times then yield the layer thickness.
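
The interpretation of a reversed profile can be sketched as follows. The function names and the model (a 4 km/s layer over a 6 km/s halfspace dipping 5°, with perpendicular depths of 2 and 3 km) are hypothetical. The sketch also assumes that the dip is smaller than the critical angle and that $i_c + \theta < 90°$, so the inverse sines in Eqns 20 and 21 return the intended angles.

```python
import math


def apparent_velocities(v0, v1, dip):
    """Forward problem, Eqn 18: (v_d, v_u) for a dip given in radians."""
    i_c = math.asin(v0 / v1)              # critical angle, Eqn 5
    v_d = v0 / math.sin(i_c + dip)        # down-dip apparent velocity
    v_u = v0 / math.sin(i_c - dip)        # up-dip apparent velocity
    return v_d, v_u


def interpret_reversed_profile(v0, v_d, v_u, tau_d, tau_u):
    """Inverse problem: dip, critical angle, v1, h_d, h_u (Eqns 19-21)."""
    a = math.asin(v0 / v_d)               # equals i_c + dip
    b = math.asin(v0 / v_u)               # equals i_c - dip
    dip = 0.5 * (a - b)                   # Eqn 20
    i_c = 0.5 * (a + b)                   # Eqn 21
    v1 = v0 / math.sin(i_c)               # Eqn 5 solved for v1
    h_d = tau_d * v0 / (2.0 * math.cos(i_c))   # Eqn 19 solved for h_d
    h_u = tau_u * v0 / (2.0 * math.cos(i_c))   # Eqn 19 solved for h_u
    return dip, i_c, v1, h_d, h_u


# Hypothetical model used to generate "observed" slopes and intercepts
v0, v1_true, dip_true = 4.0, 6.0, math.radians(5.0)
h_d_true, h_u_true = 2.0, 3.0             # km, perpendicular to the interface
i_c_true = math.asin(v0 / v1_true)
v_d, v_u = apparent_velocities(v0, v1_true, dip_true)
tau_d = 2.0 * h_d_true * math.cos(i_c_true) / v0
tau_u = 2.0 * h_u_true * math.cos(i_c_true) / v0

dip, i_c, v1, h_d, h_u = interpret_reversed_profile(v0, v_d, v_u, tau_d, tau_u)
print(math.degrees(dip), v1, h_d, h_u)
```

The up-dip apparent velocity exceeds the true halfspace velocity and the down-dip one is lower, so a single unreversed profile would give a biased estimate of $v_1$.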

Two additional points about reversed profiles are worth 
noting. First, the different up-dip and down-dip head wave 
travel time curves do not imply that for a given pair of locations,
it makes a difference whether the source is up-dip and the
receiver down-dip, or the reverse (Fig. 3.2-12). By reciprocity, 
the two experiments give the same travel time. Thus, for a ray 
path connecting two points, it does not matter whether the 
wave travels up-dip or down-dip. By contrast, for two receivers 
at the same distance from a source, one up-dip and one down- 
dip, the travel times differ because the ray paths encounter the 
dipping interface at different depths. Similarly, the travel times 


Fig. 3.2-12 Left: If the source and the receiver are interchanged on a
reversed refraction profile, the travel time is unchanged. Right: Different
up-dip and down-dip travel times occur because, for a given source
position, waves going the same distance along the surface in opposite
directions sample the dipping interface differently.


differ for two sources at the same distance from a receiver, one 
up-dip and one down-dip. If the dip were zero, then the travel 
times would be the same for all these cases because all ray paths 
encounter the interface at the same depth. Another way to 
view this is that for a flat geometry the travel time depends only 
on the distance between the source and the receiver. For a dipping
geometry, the position as well as the separation matters,
because the depth to the interface varies. 

Second, the dip found from a reversed profile is not a true 
dip if the profile is not perpendicular to the strike of the layer. 
Instead, the measured dip is an apparent dip along the profile. 
The true dip can be found from the apparent dips along two 
reversed profiles that cross at a reasonably large angle, using 
a standard technique in structural geology. 

3.2.3 Advanced analysis methods 

Because the analysis above has been for simple geometries and 
uniform-velocity layers, refraction seismology might seem of 
little use in understanding the real earth. Fortunately, this is 
not the case. The simple geometries give models that fit data 
reasonably well and provide starting models for more sophisticated
analyses.

Data from experiments showing travel times more complex 
than predicted by simple geometries can be interpreted with 
a computer program to trace rays using Snell’s law through 
possible velocity structures. The predicted travel time curve 
is found by taking rays that arrive at a given distance, and 
integrating the slowness along their paths (Eqn 3.1.1). 
Figure 3.2-13 shows a record section and the inferred velocity 
structure for a refraction survey in central California. Ray paths 
calculated through the structure shown yield a good fit to the 
complicated travel time data. For example, the late arrivals 
about 8 km from the source are interpreted as resulting from a 
low-velocity region associated with a set of faults. The model
also fits the travel times showing several velocity increases
beyond this distance.

Fig. 3.2-13 Reduced travel time plot and ray tracing results for a seismic
refraction survey. The solid line on the travel time plot shows the travel
times predicted by the model. (Meltzer et al., 1987. © Seismological
Society of America. All rights reserved.)
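
As a much simplified sketch of such ray tracing, consider rays reflected from the base of a stack of flat, uniform layers. This is far simpler than the laterally varying structures used in real surveys, and the function name and model values are hypothetical. Snell's law keeps the ray parameter $p = \sin i_j / v_j$ constant in every layer, and the travel time is the sum of slowness times path length, as in Eqn 3.1.1.

```python
import math


def trace_reflection(p, h, v):
    """Trace a ray down through flat layers and back up to the surface.

    p: ray parameter sin(i)/v in s/km, constant along the ray (Snell's law)
    h: layer thicknesses (km); the ray reflects off the base of the last one
    v: velocities of those layers (km/s)
    Returns (x, t): surface distance (km) and travel time (s).
    """
    x = 0.0
    t = 0.0
    for hj, vj in zip(h, v):
        sin_i = p * vj                      # Snell's law in layer j
        if sin_i >= 1.0:
            raise ValueError("ray cannot penetrate this layer")
        cos_i = math.sqrt(1.0 - sin_i**2)
        x += 2.0 * hj * sin_i / cos_i       # horizontal offset, down and up
        t += 2.0 * hj / (vj * cos_i)        # path length times slowness
    return x, t


# Single layer: the result should reproduce the hyperbola of Eqn 3
h, v = [30.0], [6.5]
p = math.sin(math.radians(30.0)) / v[0]
x, t = trace_reflection(p, h, v)

# Two layers: a travel time curve is built by sweeping p
curve = [trace_reflection(pk, [10.0, 20.0], [5.0, 6.5])
         for pk in (0.02, 0.06, 0.10, 0.14)]
print(x, t)
print(curve)
```

Sweeping the ray parameter gives pairs of distance and time that trace out the predicted travel time curve, which can then be compared with the observed arrivals.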

The restriction of uniform-velocity layers can also be surmounted.
Geological instincts (a useful but occasionally unreliable
tool) lead us to expect that rock types, and thus velocities,
should often vary smoothly rather than in discrete jumps. Thus
we expect velocity gradients with depth, rather than sharp interfaces.
This possibility can be tested using advanced methods
of analysis that predict both the travel times and the amplitudes 
of the expected arrivals. The amplitudes make it possible to 
distinguish gradients from uniform layers, even if the travel 
times predicted are the same. Although the methods are beyond 
our scope here, we discuss some results briefly. 

To illustrate the relation between velocity structure and 
amplitudes, consider theoretical, or synthetic, seismogram record
sections for the head wave, $P_n$, and Moho reflection, $P_mP$,
predicted by two crustal models (Fig. 3.2-14). The seismograms
were computed using a method known as reflectivity,
which avoids the limitations of ray and plane wave analysis. 
The travel times are reduced at 8 km/s, and the direct wave 
is not shown. Both models have the same average velocity
structure, a 30 km-thick layer of 6.5 km/s material over an
8 km/s halfspace, so the travel times are similar. However, the 
amplitudes of the arrivals differ noticeably because the models 
have different fine structure near the Moho. 

For the sharp Moho model (Fig. 3.2-14, top) the reflected 
wave is small for distances less than the critical distance
(subcritical reflection), largest near the critical distance, and
large for distances greater than critical (supercritical, postcritical,
or wide angle reflections). Because the boundary is sharp,
this amplitude behavior is similar to that predicted for plane
waves (Fig. 2.6-11). $P_mP$ also shows the expected phase shift
for reflection past critical incidence (Section 2.6.4). The head 
wave first appears near the critical distance, 83 km, and is 
small, as expected from the plane wave approximation that 
predicts no transmitted wave past the critical angle. 
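
The critical distance quoted above can be checked with a short calculation. The sketch below assumes the simple geometry described in the text (a 30 km-thick layer of 6.5 km/s material over an 8 km/s halfspace) and uses Snell's law for the critical angle; the function name is illustrative only.

```python
import math

def critical_distance(h_km, v_layer, v_halfspace):
    """Offset (km) at which a reflection off the layer base is critically incident."""
    # Snell's law: the critical angle satisfies sin(i_c) = v_layer / v_halfspace.
    i_c = math.asin(v_layer / v_halfspace)
    # The ray travels h * tan(i_c) horizontally on each of its two legs.
    return 2.0 * h_km * math.tan(i_c)

x_c = critical_distance(h_km=30.0, v_layer=6.5, v_halfspace=8.0)
print(f"critical distance: {x_c:.1f} km")
```

The result lies between 83 and 84 km, consistent with the value of 83 km given for the sharp Moho model.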

Figure 3.2-14 (bottom) shows the effect of velocity gradients
above and below the Moho. Seismic energy trapped near the 
Moho yields larger P n amplitudes than for the sharp Moho 
case. In addition, for subcritical distances, the reflection is 
smaller than without a gradient above the Moho, because it 
no longer reflects off a sharp interface. Hence the amplitudes
of P n and P m P indicate the presence or absence of gradients at
the Moho.

Fig. 3.2-14 Synthetic seismograms showing
how the amplitudes of the head wave, P n,
and the reflected wave, P m P, depend on the
velocity structure at the Moho. Two cases
with the same average-velocity structure
are shown. At the top the Moho is a sharp
transition, and at the bottom there are
gradients above and below the Moho. The
velocity scale shows the slopes of arrivals
with different velocities. (After Braile and
Smith, 1975.)

Figure 3.2-15 illustrates these ideas for the oceanic crust 
and the mantle. Theoretical seismograms (Fig. 3.2-15, center) 
computed for a layered model that fits travel times predict 
strong reflections off the top of layer 3 (P 3 P) and the Moho 
(P m P). The observed data (Fig. 3.2-15, bottom) show strong 
P m P reflections, suggesting a sharp Moho transition. However, 
strong P 3 P reflections are not observed, implying that the transition
between layers 2 and 3 is a gradient rather than a sharp
jump. Thus, although the results of refraction studies are often 
reported as layered models that fit the travel times, amplitude 
studies are needed to show whether sharp interfaces exist. 

An interesting point is that, because layers are distinguished 
from gradients by interpreting the amplitudes of seismic waves, 
this distinction depends on the wavelength of the wave used to 
study the structure. A reasonable approximation is that waves 
“see” only structures longer than their wavelengths. In other 
words, waves are affected by the medium properties averaged 
over their wavelengths. For example, the velocity structures 
in Fig. 3.2-16 appear identical to waves with a wavelength 
of 1 km, but look quite different for a wavelength of 1 m. 
Thus profile 3 appears as a sharp interface for waves with
wavelength 1 km, a gradient for 100 m wavelength, and a stack
of layers for 10 m wavelength. The apparent velocity structure
thus depends on the wavelengths used, so a velocity “gradient”
is a structure that cannot be distinguished, with the wavelengths
used, from one in which velocity changes smoothly.
Similarly, an “interface” is a region that cannot be distinguished
from a sharp velocity change with the wavelengths
used.

3.2.4 Crustal structure 

Information about crust and upper mantle structure around 
the world has been acquired by refraction surveys conducted 
on different scales. The size of the sources and the source-to- 
receiver distances increase with the depth of the structures 
being studied. Earthquakes or large explosions, including 
nuclear weapons tests, have enough energy to reveal the Moho. 
For example, the profile in Fig. 3.2-5, which showed clear 
Moho arrivals, was almost 250 km long and used sources containing
136 kg of explosive. Shorter profiles are used to study
structure within the crust, as in Fig. 3.2-13. The recording
stations are either permanent seismic stations or, in most cases,
portable seismometers. Refraction studies are also conducted
at sea. In some cases, disposable sonobuoys or retrievable
ocean bottom seismometers are deployed, and a ship steams
away firing “shots.” In other cases, two ships are used. Marine
refraction data (e.g., Fig. 3.2-15) are analyzed by treating
the water as an upper layer of known velocity. The refraction
results are combined with those from seismic reflection
techniques, discussed in the next section, in which the velocity
structure is derived from the travel times of subcritical
reflections, rather than refractions. Refraction and reflection
results are complementary and yield improved knowledge of
structure.

Fig. 3.2-15 Top: Oceanic crust model with
sharp transitions between layer 1 (water),
layer 2 (unconsolidated sediment), layer 3
(crustal rock), and the mantle. Center:
Synthetic seismograms for this model. P 2,
P 3, and P n are head waves from layers 2, 3,
and the mantle. P 3 P and P m P are reflections
off the tops of layer 3 and the mantle.
Bottom: Data showing an absence of the
large P 3 P arrivals predicted by the layered
model. (After Spudich and Orcutt, 1980.
Rev. Geophys. Space Phys., 18, 627-45,
copyright by the American Geophysical
Union.)

Fig. 3.2-16 Different velocity profiles that are indistinguishable when
examined by using 1 km wavelength seismic waves, but distinguishable
with much shorter wavelengths. (Spudich and Orcutt, 1980.
Rev. Geophys. Space Phys., 18, 627-45, copyright by the
American Geophysical Union.)

The oceanic crust is about 7 km thick, and is relatively uniform
from site to site, except at mid-ocean ridges. As a result, a
single simple model like that in Fig. 3.2-15 is often applicable. 
By contrast, the continental crust is thicker and variable, as 
illustrated in Fig. 3.2-17 for a cross-section across the west 
coast of the United States. The thin crust beneath the Pacific 
Ocean thickens across the continent-ocean transition, such 
that beneath the coast ranges the Moho is about 25 km deep. 
Beneath the Sierra Nevada range, the depth to the Moho 
reaches 35-40 km. The refraction data also show complicated 
and variable-velocity structures within the crust. Thus the crust 
is not a uniform layer, or even a uniform set of layers, because
in some places it contains velocity gradients. Although early
refraction studies suggested the existence of the Conrad discontinuity
dividing the upper and lower crust, it now appears
that high-velocity (greater than about 6.5 km/s) lower crust
is present in some places but not in others. Furthermore, some
areas show low-velocity zones within the crust.

Fig. 3.2-17 Crustal velocity model and
inferred geologic structure for a cross-section
across the west coast of the USA.
“SAF” denotes the San Andreas fault.
Dashed lines indicate low-velocity zones.
Legend: sedimentary rocks (K, T); granitic rocks (J, K);
intermediate intrusive rocks (J, K); Franciscan-type marine
metasedimentary and metavolcanic rocks (Mz, Cz);
undifferentiated metamorphic rock, refers to Sierra Nevadan
foothills Belt-type rocks beneath Great Valley (PC, Pz, Mz);
intermediate and mafic [...] equivalents (Cz).
(After Mooney and Weaver, 1989.
From Geophysical Framework of the
Continental United States, ed. L. C. Pakiser
and W. D. Mooney, with permission of
the publisher, the Geological Society of
America, Boulder, CO. © 1989 Geological
Society of America.)

Refraction studies show regional variations in crustal 
thickness and P n velocities, as illustrated for North America 
in Fig. 3.2-18. East of ~104°W, the crust is typically thick 
(~42 km), and P n velocities are high (~8.1 km/s). To the west, 
the crust is often thinner, with lower P n velocities. The thin 
crust and low P n velocities beneath the Basin and Range province
may reflect hotter material near the surface, consistent
with active extension. As seen here and globally (Fig. 3.2-19), 
mountain ranges often have thick crust. The thick crust is 
thought to be due to isostasy, whereby the excess mass of the 
mountains is at least partially compensated by a crustal root 
with density less than that of the mantle. 

The continental Moho can be modeled as a simple interface
for the wavelengths used in most refraction studies. However,
seismic reflection studies, with shorter wavelengths,
sometimes show a laminated structure of high- and low-velocity
layers (Fig. 3.2-20). In other cases, the Moho is not
observed in reflection data. Some of these complexities may
reflect difficulties associated with seismic reflection studies in
laterally varying media (Section 3.3). Nonetheless, the Moho
appears to be a complicated transition zone 0-5 km wide, with
properties varying between locations (Fig. 3.2-21). Rather than
regarding the Moho as the base of a homogeneous crustal
layer, it is better to view it as a zone where velocities increase
rapidly with depth to values above about 7.7 km/s.

Velocity structures are often interpreted in terms of composition,
as in Fig. 3.2-17. To do this, seismological results are
combined with other geophysical data (e.g., gravity), geological
fieldwork, and laboratory studies of the seismic velocities
of rocks. The laboratory data show that velocity varies with
composition, as shown in Fig. 3.2-22 for igneous rocks of
the crust and upper mantle. Moreover, velocity increases with
pressure and decreases with temperature. Inferences about
composition are thus made by comparing predicted velocities
to seismic observations. For pressures expected at greater
depths, as for the lower mantle and core, laboratory experiments
are more difficult, so thermodynamic calculations are
also used to extrapolate experimental data to higher temperatures
and pressures.

Such analyses imply that the upper continental crust has
an average composition like granodiorite, whereas the upper
oceanic crust is gabbroic. 4 Historically, two types of models
have been suggested for the Moho. In one, the Moho divides
chemically different rocks, whereas in the other, it is a phase
boundary separating rocks with the same bulk chemistry but
different minerals. These models correspond to different combinations
of rocks on either side. Two candidates for the lower
continental crust are gabbro or rocks of intermediate composition
in the granulite facies. The most popular candidate for the
upper mantle is peridotite, which would make the Moho a
compositional boundary. Another candidate is eclogite, a rock
with the same bulk chemistry as gabbro, but denser mineral
phases. If the upper mantle were eclogite and the lower continental
crust gabbroic, the continental Moho would be a phase
boundary. However, although eclogite and peridotite have
similar seismic velocities, peridotite seems a more likely composition
for the upper mantle. One of the reasons is that
olivine, a major component of peridotite, yields anisotropic
seismic velocities due to its crystal structure. Such anisotropic
P n velocities are observed in the oceanic upper mantle and in
some locations in the continental upper mantle (Section 3.6).

4 Some relevant rock and mineral nomenclature is summarized in Section 3.2.5.

Fig. 3.2-18 Crustal thickness (depth to
Moho) (top) and P n velocity (bottom)
maps for part of North America. Contour
intervals are 5 km and 0.1 km/s. (Braile et
al., 1989. From Geophysical Framework
of the Continental United States, ed.
L. C. Pakiser and W. D. Mooney, with
permission of the Geological Society of
America, Boulder, CO. © 1989 Geological
Society of America.)

The status of the lower continental crust is more controversial.
A granulite model is popular, but gabbro cannot be
ruled out. Similarly, the origin of the laminated structure of the
Moho is still unclear. Possible explanations include metamorphosed
sediments, cumulate layering, tectonic banding,
and lenses of partial melt. In any event, this structure seems to 
be laterally variable. 
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Fig. 3.2-19 Global map of crustal 
thickness. (Mooney et al., 1998.
J. Geophys. Res., 103, 727-47. Copyright
by the American Geophysical Union.)
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Fig. 3.2-20 Seismic reflection profile from 
the Wichita Mountains of southeastern 
Oklahoma. The “ringing” Moho reflections 
at 14.5-15 s in the middle of the section 
suggest that the Moho has a laminated 
velocity structure over several km. (Hale 
and Thompson, 1982. J. Geophys. Res.,
87, 4625-35, copyright by the American
Geophysical Union.) 


3.2.5 Rocks and minerals 

Interpreting seismological results for the crust and mantle 
in terms of composition requires knowing something about 
rocks and the minerals that compose them. Although these are 
complicated subjects, we summarize a few essential terms. 

For our discussions of crust and upper mantle structure, 
the most important rocks are the igneous rocks formed by 
cooling a molten magma. These rocks are classified primarily 
by the weight percent of silica, SiO2. A common nomenclature
describes rocks as acidic or silicic for a weight percent of SiO2 >
66%, intermediate for 66-52%, basic or mafic for 52-45%,
and ultrabasic or ultramafic for < 45%.
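
The silica-based nomenclature can be written as a small lookup. This is only a sketch: the function name is hypothetical, and the treatment of values falling exactly on a boundary (52% and 45% are assigned to the more silica-rich class) is an assumption, because the text does not specify it.

```python
def classify_by_silica(sio2_wt_percent):
    """Return the nomenclature class for a given SiO2 weight percent."""
    if not 0.0 <= sio2_wt_percent <= 100.0:
        raise ValueError("weight percent must lie between 0 and 100")
    if sio2_wt_percent > 66.0:
        return "acidic (silicic)"
    if sio2_wt_percent >= 52.0:      # boundary handling is an assumption
        return "intermediate"
    if sio2_wt_percent >= 45.0:      # boundary handling is an assumption
        return "basic (mafic)"
    return "ultrabasic (ultramafic)"

for value in (72.0, 60.0, 48.0, 42.0):
    print(value, classify_by_silica(value))
```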

Physical properties of rocks, such as density and seismic 
velocity, depend on their mineral composition. Figure 3.2-23
summarizes the major minerals in various rocks at near-surface
temperatures and pressures. Because rock names refer
to a range of compositions, those shown are averages. Rocks 
of the same composition have different names depending on 
whether they form at the earth’s surface (extrusive rocks) or 
below it (intrusive rocks). Hence an extrusive rock of gabbroic 
composition is a basalt. 

Several important silicate (SiO2-bearing) minerals are
mentioned in the figure. Quartz is pure SiO2. Olivine is a
solid solution, (Mg,Fe)2SiO4, whose composition varies from
pure Fe2SiO4 (fayalite) to pure Mg2SiO4 (forsterite). Due to
its crystal structure, olivine has anisotropic seismic velocities.
Pyroxene is a solid solution with end members MgSiO3
(enstatite), FeSiO3 (ferrosilite), CaMg(SiO3)2 (diopside), and
CaFe(SiO3)2 (hedenbergite), though only certain ranges of
compositions exist in nature. Feldspar is a solid solution
with end members CaAl2Si2O8 (anorthite), NaAlSi3O8 (albite),
and KAlSi3O8 (sanidine, orthoclase, and microcline). The
Na- and Ca-rich feldspars are called the plagioclase feldspars.
A similar mineral group, the amphiboles, includes
hornblende, NaCa2(Mg,Fe)4(Al,Fe)(Si3AlO11)2(OH)2. Biotite,
K(Mg,Fe)3Si3AlO10(OH)2, and muscovite, KAl2Si3AlO10(OH)2,
are in a group of minerals called micas. Garnets are minerals of
the form A3B2(SiO4)3, where A is usually one of the ions Ca,
Mg, or Fe, and B is typically any of Al, Fe, or Cr. Garnets are
comparatively dense, and thus significant for discussions of
phase changes.

Fig. 3.2-21 Schematic model for the
continental Moho as a laminated structure.
Refraction studies using relatively longer
wavelengths would show clear P m P and
P n arrivals, whereas reflection studies
using shorter wavelengths would show
reverberations. (Braile and Chiang, 1986.
Reflection Seismology, 257-72, copyright
by the American Geophysical Union.)

Fig. 3.2-22 Variation of P-wave velocity with lithology for crust and
upper mantle rocks, at a pressure of 1.5 kbar (150 MPa). Velocity
increases with decreasing silica content. (Fountain and Christensen,
1989. From Geophysical Framework of the Continental United States,
ed. L. C. Pakiser and W. D. Mooney, with permission of the Geological
Society of America, Boulder, CO. © 1989 Geological Society of
America.)

Fig. 3.2-23 Simplified igneous rock classification. Compositions are
shown as the volume percent of major minerals for a rock of given silica
content (horizontal axis). Thus a granodiorite of about 60% silica content
contains about 20% amphibole, 5% biotite, 53% plagioclase feldspar,
17% quartz, and 5% potassium feldspar. Rock names are given for
intrusive and extrusive forms. Extrusive rocks: rhyolite, dacite,
andesite, basalt, komatiite. Intrusive rocks: granite, granodiorite,
diorite, gabbro, peridotite.

The figure describes rocks in terms of their mineralogy
at surface conditions. With increasing pressure due to increasing
depth in the earth, minerals transform to denser phases.
Thus, for example, a gabbro containing plagioclase feldspar,
pyroxene, and olivine transforms to a chemically identical
eclogite rock containing quartz, pyroxene, and garnet. Hence
an argument against eclogite being a major component of the
upper mantle is that, by contrast with peridotite, it does not
contain olivine and would not yield the observed anisotropic P n
velocities. However, the gabbro-to-eclogite transformation
may occur in subducting slabs (Section 5.4.2) and play a role in
causing earthquakes there.

3.3 Reflection seismology 

In the last section, we concentrated on the use of refracted 
arrivals to infer velocity structure with depth, and noted that 
reflected arrivals also contain valuable information for this 
purpose. Studies using the reflected arrivals, known as reflection
seismology, determine velocities within the crust, and
thus are essential in oil and gas exploration. As a result, data
acquisition and processing methods have often been developed 
first by reflection seismologists. For example, digital data were 
generally used in exploration before they became common 
in earthquake studies. Similarly, because reflection data are 
densely sampled in space and time, and the mathematics of 
wave propagation in a layered medium is simpler than for 
a spherical earth, techniques are often first developed with 
reflection data. In this section we survey basic concepts in 
reflection seismology, some of which we later apply to earthquakes
and the spherical earth.

3.3.1 Travel time curves for reflections

We first consider the simplest geometry: a flat layer of uniform-velocity
material underlain by a halfspace with a higher
velocity (Fig. 3.2-1). Although most applications use P waves,
we write the velocity as “v” because the results also apply to
S waves. For a layer of thickness $h_0$ with velocity $v_0$, we saw
in Section 3.2.2 that the travel time as a function of source-to-receiver
distance, known as offset in reflection seismology, is

$$T(x)^2 = x^2/v_0^2 + 4h_0^2/v_0^2 = x^2/v_0^2 + t_0^2. \qquad (1)$$

The travel time curve T(x) is a hyperbola (Fig. 3.3-1) that
intercepts the T axis at $t_0 = 2h_0/v_0$, the travel time at zero
offset. This time is called the two-way vertical travel time,
because the corresponding ray traveled vertically down to
the reflector and back. Although this curve is the same as the
“reflected wave” curve in Fig. 3.2-2, the convention in reflection
seismology is to plot time increasing downward, 1 because
later arrivals reflect deeper in the earth.

Fig. 3.3-1 The travel time curve for a reflection off a flat interface is a
hyperbola, with the minimum at x = 0 corresponding to a vertical ray
path. The slope is zero at x = 0 and increases with the offset distance.

Fig. 3.3-2 Two rays showing the relationship between the angle of
incidence, ray parameter, and the slope of the travel time curve for a
flat medium.

The layer velocity is found from the slope of the hyperbola.
Because the slope decreases with increasing velocity, “flatter”
travel time curves indicate higher velocities. To see this, note
that a plot of $T(x)^2$ versus $x^2$ has slope $1/v_0^2$. Alternatively, the
variation in travel time with offset is often stated in terms of
normal moveout (NMO), the difference between the travel
time at some offset and that at zero offset,

$$T(x) - t_0 = (x^2/v_0^2 + t_0^2)^{1/2} - t_0. \qquad (2)$$

Once the velocity is found, the layer thickness is given by the
vertical travel time.
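
As an illustration of Eqns 1 and 2, the following sketch computes the reflection travel time and the normal moveout for a single layer, then recovers the layer velocity and thickness from two travel time picks using the slope of $T^2$ versus $x^2$. The function names and the example values (a 2 km-thick layer with a velocity of 3 km/s) are hypothetical.

```python
import math

def reflection_time(x, h0, v0):
    """Travel time of the reflection at offset x (Eqn 1)."""
    t0 = 2.0 * h0 / v0                      # two-way vertical travel time
    return math.sqrt(x * x / (v0 * v0) + t0 * t0)

def normal_moveout(x, h0, v0):
    """Travel time at offset x minus the zero-offset time (Eqn 2)."""
    return reflection_time(x, h0, v0) - 2.0 * h0 / v0

def layer_from_picks(x1, t1, x2, t2):
    """Velocity and thickness from two (offset, time) picks on one hyperbola."""
    slope = (t2 * t2 - t1 * t1) / (x2 * x2 - x1 * x1)   # equals 1 / v0**2
    v0 = 1.0 / math.sqrt(slope)
    t0 = math.sqrt(t1 * t1 - x1 * x1 * slope)           # intercept of T^2 versus x^2
    return v0, v0 * t0 / 2.0

h0_true, v0_true = 2.0, 3.0                 # km and km/s, illustrative values
picks = [(x, reflection_time(x, h0_true, v0_true)) for x in (1.0, 4.0)]
v0_est, h0_est = layer_from_picks(picks[0][0], picks[0][1], picks[1][0], picks[1][1])
print(v0_est, h0_est)
```

With noise-free picks the estimates reproduce the input velocity and thickness; with real data the slope would normally be fit to many picks rather than two.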

To see the relation between the travel time curve and ray
paths, consider the ray paths to two points dx apart, which differ
in travel time by dT (Fig. 3.3-2). Because the ray paths differ
in length by v dT, the angle of incidence can be found using

$$\sin i = v \frac{dT}{dx} \qquad (3)$$

or, in terms of the ray parameter p (Section 2.5.7),

$$p = \frac{\sin i}{v} = \frac{dT}{dx}. \qquad (4)$$

This is consistent with our earlier definition of the ray parameter
as the reciprocal of the apparent velocity along the
surface of the wave front, which moves a distance dx in time
dT, because

$$p = 1/c_x = 1/(dx/dT). \qquad (5)$$

1 Earthquake seismologists generally follow the opposite convention.

Fig. 3.3-3 Ray geometry for a reflection in a flat-layered medium. Layer
thicknesses are $h_j$, horizontal distances traveled in the layers are $x_j$, and
one-way travel times spent in the layers are $\Delta T_j$.

Thus the ray parameter and the angle of incidence of the ray
emerging at a distance x can be found from dT/dx, the slope
of the travel time curve evaluated at x. From Eqn 2, the slope
is zero at x = 0 and then increases with offset; so the angle of
incidence is nearly zero (vertical incidence) at short distances
and becomes closer to 90° (horizontal) at larger distances
(Fig. 3.3-1).
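
The link between the slope of the travel time curve and the incidence angle can be checked numerically for the single-layer reflection. The sketch below differentiates Eqn 1 to obtain the slope and then applies Eqn 3; the function name and the layer values are illustrative assumptions.

```python
import math

def slope_and_incidence(x, h0, v0):
    """Ray parameter p = dT/dx and incidence angle (degrees) at offset x."""
    t0 = 2.0 * h0 / v0
    T = math.sqrt(x * x / (v0 * v0) + t0 * t0)   # Eqn 1
    p = x / (v0 * v0 * T)                        # dT/dx from differentiating Eqn 1
    i_deg = math.degrees(math.asin(v0 * p))      # Eqn 3: sin i = v dT/dx
    return p, i_deg

h0, v0 = 2.0, 3.0                                # km and km/s, illustrative values
for x in (0.0, 1.0, 4.0, 20.0):
    p, i_deg = slope_and_incidence(x, h0, v0)
    print(f"x = {x:5.1f} km  p = {p:.4f} s/km  i = {i_deg:5.1f} deg")
```

At an offset equal to twice the layer thickness each leg of the ray covers a horizontal distance equal to the thickness, so the angle from the slope should be 45°, which agrees with the geometry.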

This lets us find the travel time curve for reflections in a
geometry with multiple horizontal layers. Figure 3.3-3 shows
that the reflection $R_{n+1}$ from the top of the $(n+1)$th layer (or the
bottom of the $n$th layer) has traveled through n layers, each
of thickness $h_j$ and velocity $v_j$. Such rays, which have been
reflected only once, are known as primary reflections. Because,
by Snell’s law, the ray parameter p is constant along a ray, the
incidence angles $i_j$ in each layer can be found from the incidence
angle $i_0$ in the top layer,

$$p = \frac{\sin i_0}{v_0} = \frac{\sin i_j}{v_j}. \qquad (6)$$

A downgoing ray, which travels a horizontal distance $x_j$ in the
$j$th layer, spends a time $\Delta T_j$ in the layer. Thus, in going down
and up again, the ray travels a total horizontal distance

$$x(p) = 2\sum_{j=0}^{n} x_j = 2\sum_{j=0}^{n} h_j \tan i_j \qquad (7)$$

in a total time

$$T(p) = 2\sum_{j=0}^{n} \Delta T_j = 2\sum_{j=0}^{n} \frac{h_j}{v_j \cos i_j}. \qquad (8)$$

We explicitly write x(p) and T(p), because the two sums are
formulated in terms of the ray parameter. To see how they
allow us to compute the corresponding travel time curve T(x),
consider a single layer, where $x_0$ (= x/2) is the horizontal distance
along each of the downgoing and upgoing legs. In this
case, Eqn 8 becomes

$$T(x) = 2[(x/2)^2 + h_0^2]^{1/2}/v_0, \qquad (9)$$

because

$$\cos i_0 = h_0 (x_0^2 + h_0^2)^{-1/2}. \qquad (10)$$

Hence Eqn 8 yields Eqn 9, which is equivalent to the relation
we derived earlier showing that the travel time curve for the
reflection is a hyperbola (Eqn 1).

For multiple layers, we approximate the travel time curve
for the reflection $R_{n+1}$ off the top of the $(n+1)$th layer as a
hyperbola,

$$T(x)_{n+1}^2 = x^2/V_n^2 + t_n^2, \qquad (11)$$

and find the two parameters, $V_n$ and $t_n$. $t_n$ is the total two-way
(up and down) vertical travel time at zero offset, which is twice
the sum of the one-way vertical travel times $\Delta t_j$ for each layer

$$t_n = 2\sum_{j=0}^{n} \Delta t_j = 2\sum_{j=0}^{n} (h_j/v_j). \qquad (12)$$

The velocity term, $V_n$, is a little trickier. From the geometry, the
distance traveled by the downgoing ray in layer j is

$$x_j = v_j \Delta T_j \sin i_j = (v_j^2/v_0) \Delta T_j \sin i_0, \qquad (13)$$

where the last step used Snell’s law (Eqn 6). Hence, by Eqn 7,
the total distance, x, can be written

$$x = 2\sum_{j=0}^{n} x_j = 2\,\frac{\sin i_0}{v_0} \sum_{j=0}^{n} v_j^2 \Delta T_j. \qquad (14)$$

Because the ray parameter is constant along a ray, the slope of
the travel time curve is, by Eqn 4,

$$\frac{dT}{dx} = \frac{\sin i_0}{v_0} = x \Big/ \Big( 2\sum_{j=0}^{n} v_j^2 \Delta T_j \Big). \qquad (15)$$

For the hyperbolic approximation (Eqn 11), the slope of the
travel time curve is

$$\frac{dT}{dx} = \frac{x}{V_n^2 T}, \qquad (16)$$

so we define

$$V_n^2 = \Big( 2\sum_{j=0}^{n} v_j^2 \Delta T_j \Big) \Big/ T. \qquad (17)$$

Because this was derived for an arbitrary incidence angle,
vertical incidence can be used for simplifications, so in each
layer the travel time equals the one-way vertical travel time,
$\Delta T_j = \Delta t_j$, and the total travel time is $T = 2\sum_{j=0}^{n} \Delta t_j$. Hence

$$V_n^2 = \Big( \sum_{j=0}^{n} v_j^2 \Delta t_j \Big) \Big/ \Big( \sum_{j=0}^{n} \Delta t_j \Big). \qquad (18)$$

$V_n$, the appropriate average velocity for the travel time curve, is
the time-weighted root mean square, or rms, velocity for the
first n layers. This hyperbolic approximation and the exact
solutions agree well except for large offsets.

These results let us find the layer velocities from the travel
time curves. Given a reflection from the top of the $n$th layer,
with vertical two-way travel time $t_{n-1}$ and rms velocity $V_{n-1}$,
and a reflection from the top of the $(n+1)$th layer, with vertical
two-way travel time $t_n$ and rms velocity $V_n$, the velocity in the
$n$th layer is

$$v_n^2 = \frac{V_n^2 t_n - V_{n-1}^2 t_{n-1}}{t_n - t_{n-1}}. \qquad (19)$$

This relationship is called the Dix equation. 2 The resulting
velocity, called an interval velocity, is better determined for
larger offsets, where the slope of the travel time curve is greater.
Because the later reflections have higher velocities, and hence
flatter travel time curves (Fig. 3.3-4), larger offsets are required
to determine velocities at greater depths.
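
The rms velocity and the Dix equation can be illustrated with a short round-trip calculation: build the two-way vertical times and rms velocities for a stack of layers, then recover each interval velocity from successive pairs. This is a minimal sketch; the function names and the three-layer model are hypothetical.

```python
import math

def rms_velocities(thicknesses, velocities):
    """Return a list of (t_n, V_n): two-way vertical time and rms velocity
    for the reflection off the base of each layer (Eqns 12 and 18)."""
    sum_dt = 0.0       # running sum of one-way vertical times
    sum_v2dt = 0.0     # running sum of v_j**2 * dt_j
    result = []
    for h, v in zip(thicknesses, velocities):
        dt = h / v
        sum_dt += dt
        sum_v2dt += v * v * dt
        result.append((2.0 * sum_dt, math.sqrt(sum_v2dt / sum_dt)))
    return result

def dix_interval_velocity(t_prev, V_prev, t_n, V_n):
    """Interval velocity between two reflectors (Eqn 19)."""
    return math.sqrt((V_n ** 2 * t_n - V_prev ** 2 * t_prev) / (t_n - t_prev))

thicknesses = [1.0, 2.0, 3.0]      # km, illustrative values
velocities = [2.0, 4.0, 6.0]       # km/s, illustrative values
stack = rms_velocities(thicknesses, velocities)
recovered = [stack[0][1]] + [
    dix_interval_velocity(stack[k - 1][0], stack[k - 1][1], stack[k][0], stack[k][1])
    for k in range(1, len(stack))
]
print(recovered)
```

For exact inputs the recovered interval velocities match the layer velocities used to build the model. In practice the rms velocities come from hyperbola fits, so errors in them are amplified when the two reflectors are close together in time.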

Travel time calculations are more complicated for dipping
layers. Figure 3.3-5 shows the geometry for a reflector of dip $\theta$,
whose depth along the perpendicular to the reflector below the
origin is h. The travel times can be derived using an imaginary
source on the line from the surface source normal to the reflector,
at the same distance below the layer, so that travel times
from the imaginary source to the receivers are the same as from
the true source. Applying the law of cosines to triangle RIS
shows that

$$T^2 = [x^2 + 4h^2 - 4hx \cos(\theta + \pi/2)]/v_0^2 = [x^2 + 4h^2 + 4hx \sin\theta]/v_0^2. \qquad (20)$$

This travel time curve is a hyperbola with minimum at
$x = -2h \sin\theta$, so it is not symmetric about x = 0. Reflections from
a stack of dipping layers yield travel time curves of approximately
this form.
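
The shift of the travel time minimum away from zero offset can be verified by evaluating Eqn 20 on a grid of offsets. The sketch below uses hypothetical values for the depth, dip, and velocity, and a simple grid search rather than any particular processing method.

```python
import math

def dipping_reflection_time(x, h, dip_deg, v0):
    """Travel time for a reflector of the given dip (Eqn 20)."""
    theta = math.radians(dip_deg)
    return math.sqrt(x * x + 4.0 * h * h + 4.0 * h * x * math.sin(theta)) / v0

h, dip_deg, v0 = 2.0, 10.0, 3.0    # km, degrees, km/s: illustrative values
offsets = [-3.0 + 0.001 * k for k in range(6001)]
x_min = min(offsets, key=lambda x: dipping_reflection_time(x, h, dip_deg, v0))
x_expected = -2.0 * h * math.sin(math.radians(dip_deg))
print(f"grid minimum at x = {x_min:.3f} km, expected {x_expected:.3f} km")
```

Setting the dip to zero returns the symmetric hyperbola of Eqn 1, with its minimum at x = 0.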

It is sometimes useful to think of the earth as having a
continuous distribution of velocity with depth, $v(z)$, rather
than a stack of discrete layers, each with uniform velocity. The
expressions for the ray path and travel time of a ray with ray
parameter p for discrete layers can be generalized. The ray path
(Fig. 3.3-6) is given by Snell’s law, because the ray parameter,

$$p = \sin i / v(z), \qquad (21)$$

is constant along a ray. If velocity increases with depth, sin i
and thus i increase, so the ray bends away from the vertical on
its way down. Once i = 90°, the ray turns, becomes horizontal,
and then goes upward. At the deepest point, the turning, or
bottoming, depth $z_p$, the velocity is the reciprocal of the
ray parameter, $p = 1/v(z_p)$. If on some portion of the ray path
the velocity decreases with depth, the ray bends toward the
vertical. The ray does not turn upward until it gets below the
low-velocity region.

2 Named after its discoverer, pioneering exploration seismologist C. Hewitt Dix
(1905-84).

Fig. 3.3-4 Travel time curves for reflections (left) from a layered structure
(right) corresponding to continental crust. Reflections from deeper
interfaces are flatter, or have shallower slopes, due to the increase of
velocity with depth.

Fig. 3.3-5 The travel time curve for a reflection off a dipping interface can
be derived using an imaginary source (I) at depth that gives the same travel
times. The resulting hyperbola has a minimum at a nonzero offset,
$x = -2h \sin\theta$. S and R denote the source and the receiver.

Fig. 3.3-6 Ray path in a medium with velocity increasing smoothly with
depth. The ray parameter is constant along the ray path, so the angle of
incidence changes as the velocity changes. The incidence angle is smallest
at the surface, where velocity is lowest, and is 90° at the bottoming
depth, $z_p$.

We thus replace the sums over layer thickness $h_j$ with
integrals over depth, such that the expression for the distance
traveled by the ray (Eqn 7) becomes

$$x(p) = 2\int_0^{z_p} \tan i \, dz = 2p \int_0^{z_p} \frac{dz}{(1/v^2(z) - p^2)^{1/2}}, \qquad (22)$$

because

$$\sin i = p v(z) \quad \text{and} \quad \cos i = (1 - \sin^2 i)^{1/2} = (1 - p^2 v^2(z))^{1/2}. \qquad (23)$$

This is sometimes written in terms of the slowness, the reciprocal
of velocity, as

$$u(z) = 1/v(z), \qquad (24)$$

so that

$$x(p) = 2p \int_0^{z_p} \frac{dz}{(u^2(z) - p^2)^{1/2}}. \qquad (25)$$

Similarly, the travel time sum (Eqn 8) becomes

$$T(p) = 2\int_0^{z_p} \frac{dz}{v(z) \cos i} = 2\int_0^{z_p} \frac{dz}{v(z)(1 - p^2 v^2(z))^{1/2}} = 2\int_0^{z_p} \frac{u^2(z) \, dz}{(u^2(z) - p^2)^{1/2}}. \qquad (26)$$

This integral is valid everywhere except at the exact bottom of
the ray path, where $u(z)$ equals p. A useful way to view this is to
note that the ray path (Fig. 3.3-6) can be written as an integral
over ds, where dz = cos i ds. The travel time is thus

$$T(p) = \int_{\text{path}} \frac{ds}{v(z)} = \int_{\text{path}} u(z) \, ds, \qquad (27)$$

the integral of the slowness along the ray path. Slowness,
though less intuitive to use than velocity, 3 can lead to simpler
formulations.
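
To see how the layer sums pass over to the integrals, the sketch below approximates a smooth velocity profile by many thin layers and sums Eqns 7 and 8 down to the turning depth. The linear profile $v(z) = v_0 + gz$ and all numerical values are assumptions chosen for illustration; for that profile the integrals have a standard closed form (the ray is a circular arc), which is used here only as a check.

```python
import math

def trace_ray(p, v0, g, n_layers=20000):
    """Approximate x(p) and T(p) for v(z) = v0 + g*z with thin uniform layers."""
    z_p = (1.0 / p - v0) / g               # turning depth, where v(z_p) = 1/p
    dz = z_p / n_layers
    x = t = 0.0
    for k in range(n_layers):
        v = v0 + g * (k + 0.5) * dz        # velocity at the layer midpoint
        cos_i = math.sqrt(1.0 - (p * v) ** 2)
        x += dz * p * v / cos_i            # h_j tan i_j, as in Eqn 7
        t += dz / (v * cos_i)              # h_j / (v_j cos i_j), as in Eqn 8
    return 2.0 * x, 2.0 * t                # downgoing plus upgoing legs

def linear_gradient_exact(p, v0, g):
    """Closed-form x(p) and T(p) for the same linear profile."""
    cos_i0 = math.sqrt(1.0 - (p * v0) ** 2)
    x = 2.0 * cos_i0 / (p * g)
    t = (2.0 / g) * math.log((1.0 + cos_i0) / (p * v0))
    return x, t

p, v0, g = 0.15, 5.0, 0.05                 # s/km, km/s, 1/s: illustrative values
x_num, t_num = trace_ray(p, v0, g)
x_ref, t_ref = linear_gradient_exact(p, v0, g)
print(x_num, x_ref, t_num, t_ref)
```

The thin-layer sums converge slowly because the integrands grow without bound at the turning depth, so the agreement is at the level of a fraction of a percent rather than machine precision.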

3.3.2 Intercept-slowness formulation for travel times 

So far, we have given travel time curves as T(x), the travel 
time as a function of distance. We now develop an alternative 
formulation that offers interesting insights and is useful for 
data analysis. To do so, we note that $\Delta T_j$, the one-way travel
time in the $j$th layer with velocity $v_j$, is related to the thickness,
$h_j$, and the horizontal distance traveled, $x_j$ (Fig. 3.3-3), by

$$v_j \Delta T_j = (x_j^2 + h_j^2)^{1/2}. \qquad (28)$$

The incidence angle $i_j$ for this ray satisfies

$$\sin i_j = \frac{x_j}{(x_j^2 + h_j^2)^{1/2}} = \frac{x_j}{v_j \Delta T_j}. \qquad (29)$$

We rewrite Eqn 28 as

$$v_j \Delta T_j = \frac{x_j^2 + h_j^2}{(x_j^2 + h_j^2)^{1/2}} = x_j \sin i_j + h_j \cos i_j, \qquad (30)$$

using $\cos i_j = h_j/(x_j^2 + h_j^2)^{1/2}$. Dividing by $v_j$ gives

$$\Delta T_j = \frac{x_j \sin i_j}{v_j} + \frac{h_j \cos i_j}{v_j} = p_j x_j + \eta_j h_j, \qquad (31)$$

where

$$p_j = (\sin i_j)/v_j = \sin i_j \, u_j \quad \text{and} \quad \eta_j = (\cos i_j)/v_j = \cos i_j \, u_j. \qquad (32)$$

Thus in layer j we have entities introduced in Section 2.5.7:
$p_j$ is the ray parameter, or horizontal slowness, and $\eta_j$ is the
vertical slowness. These are the components of the slowness
vector that has magnitude equal to the slowness, and points
in the direction of wave propagation. Hence $u_j$, the slowness in
the layer, is given by

$$u_j^2 = 1/v_j^2 = p_j^2 + \eta_j^2. \qquad (33)$$

By Eqn 31, the travel time a ray spends in a layer is the sum of
the horizontal slowness times the horizontal distance traveled
and the vertical slowness times the vertical thickness. The total
travel time is the sum over all layers, with a factor of two to
account for both downgoing and upgoing legs,

$$T(x) = 2\sum_{j=0}^{n} \Delta T_j = 2\sum_{j=0}^{n} p_j x_j + 2\sum_{j=0}^{n} \eta_j h_j. \qquad (34)$$

By Snell’s law, the horizontal ray parameter is constant along
the ray path, so $p_j = p$, and

$$T(x) = px + 2\sum_{j=0}^{n} \eta_j h_j, \qquad (35)$$

where $x = 2\sum_{j=0}^{n} x_j$ is the total horizontal distance traveled. This
formulation is equivalent to the way we formulated the travel
time as the scalar product of the distance and slowness vectors
(Eqn 2.5.34).

Formulating the travel time curve in this way gives interesting
insight. We define

$$T(x) = px + \tau(p), \qquad (36)$$

where

$$\tau(p) = 2\sum_{j=0}^{n} \eta_j h_j = 2\sum_{j=0}^{n} (1/v_j^2 - p^2)^{1/2} h_j = 2\sum_{j=0}^{n} (u_j^2 - p^2)^{1/2} h_j. \qquad (37)$$

Because p is the slope of the travel time curve (dT/dx) and hence
of a line tangential to it at the point (T, x), $\tau$ is the intercept
of the tangent line with the time axis (Fig. 3.3-7). In general $\tau$
and p differ for different points on the travel time curve, so the
travel time curve can be described by the values of either (T, x)
or ($\tau$, p). Thus the function $\tau(p)$ is called the intercept-slowness
representation of the travel time curve. Although less intuitive,
the $\tau(p)$ formulation is equivalent to T(x).

3 It is somehow harder to think of a zone of high slowness than a low-velocity zone.

Fig. 3.3-7 Relation between the travel time curve T(x) and the line
tangential to a point on it, which has a slope, or slowness, p and
a time axis intercept $\tau$.

Given that the slope of the travel time curve T(x) has special
significance, it is natural to investigate the slope of the function
$\tau(p)$. To do this, we write Eqn 36 with the ray parameter, rather
than the distance, as the independent variable,

$$\tau(p) = T(p) - p\,x(p), \qquad (38)$$

and differentiate

$$\frac{d\tau}{dp} = \frac{dT}{dp} - p\frac{dx}{dp} - x(p) = \frac{dT}{dx}\frac{dx}{dp} - p\frac{dx}{dp} - x(p) = -x(p). \qquad (39)$$

Thus, just as p is the slope of the travel time curve, T(x), the
distance, x, is minus the slope of the $\tau(p)$ curve.

To illustrate these ideas, we show that the $\tau(p)$ formulation
gives the travel time curve for the reflected wave in a layer over
a halfspace. Figure 3.3-3 shows that $x_0 = x/2$, so, using Eqn 32,

$$p = \frac{\sin i_0}{v_0} = \frac{x/2}{v_0[(x/2)^2 + h_0^2]^{1/2}}, \qquad \eta_0 = \frac{\cos i_0}{v_0} = \frac{h_0}{v_0[(x/2)^2 + h_0^2]^{1/2}}. \qquad (40)$$

Hence, by Eqns 36 and 37, the travel time curve is

$$T(x) = px + 2\eta_0 h_0 = \frac{(x^2/2) + 2h_0^2}{v_0[(x/2)^2 + h_0^2]^{1/2}} = 2[(x/2)^2 + h_0^2]^{1/2}/v_0, \qquad (41)$$

which is the familiar hyperbola (Eqn 9).

To see how this travel time curve appears when written as
$\tau(p)$, we write Eqn 37 for a layer over a halfspace:

$$ \tau(p) = 2 (1/v_0^2 - p^2)^{1/2} h_0. \qquad (42) $$

This can also be written as

$$ (v_0^2 \tau^2)/(4h_0^2) + v_0^2 p^2 = 1, \qquad (43) $$

which is an ellipse whose axes are the $\tau$ and $p$ axes (Fig. 3.3-8).
It intersects the $\tau$ axis at ($\tau = t_0 = 2h_0/v_0$, $p = 0$), and the $p$ axis at
($\tau = 0$, $p = 1/v_0$). Both these points have significance. The first,
where the travel time curve has zero slope and the time axis
intercept is the vertical two-way travel time, corresponds to
the zero-offset point $x = 0$.

The second, where the travel time curve has slope $1/v_0$ and
time axis intercept 0, is the $\tau(p)$ position of the linear travel
time curve for the direct wave. Hence the line for the direct
wave maps to a point in the $\tau(p)$ plane that is on the ellipse
describing the reflected wave. To understand why this occurs,
we use the fact that distance is minus the derivative of the $\tau(p)$
curve (Eqn 39) and differentiate Eqn 42, giving

$$ x(p) = -d\tau/dp = 2 p h_0 (1/v_0^2 - p^2)^{-1/2}, \qquad (44) $$

so at the point $p = 1/v_0$, $x = \infty$. This makes sense, because as
$x \to \infty$, the reflected wave is asymptotic to the direct wave
(Fig. 3.2-2).
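
The relations among Eqns 36, 37, 39, and 41 can be checked numerically. The following is a minimal Python sketch, not part of the original derivation. The layer values (a 1 km thick layer with velocity 2 km/s), the ray parameter, and the function names `tau` and `distance` are illustrative assumptions.

```python
import numpy as np

def tau(p, v, h):
    """Intercept time tau(p) = 2 * sum_j sqrt(1/v_j**2 - p**2) * h_j (Eqn 37)."""
    v, h = np.asarray(v, float), np.asarray(h, float)
    return 2.0 * np.sum(np.sqrt(1.0 / v**2 - p**2) * h)

def distance(p, v, h):
    """Distance x(p) = -d(tau)/dp, differentiating Eqn 37 term by term."""
    v, h = np.asarray(v, float), np.asarray(h, float)
    return 2.0 * np.sum(p * h / np.sqrt(1.0 / v**2 - p**2))

# Hypothetical single layer over a halfspace: v0 = 2 km/s, h0 = 1 km.
v, h = [2.0], [1.0]
p = 0.3                                    # s/km, smaller than 1/v0 = 0.5 s/km

x = distance(p, v, h)
T_from_tau = p * x + tau(p, v, h)          # Eqn 36
T_hyperbola = 2.0 * np.sqrt((x / 2.0) ** 2 + h[0] ** 2) / v[0]   # Eqn 41

# Centered finite difference as an independent check of Eqn 39.
dp = 1e-6
x_numeric = -(tau(p + dp, v, h) - tau(p - dp, v, h)) / (2.0 * dp)

# Position on the ellipse of Eqn 43; this should equal 1.
ellipse = (v[0] * tau(p, v, h)) ** 2 / (4.0 * h[0] ** 2) + (v[0] * p) ** 2
```

Under these assumptions the two travel times agree, and the finite-difference slope of $\tau(p)$ matches $-x(p)$ to within the step size.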

The head wave is easily mapped into the $\tau(p)$ plane, because
its travel time curve (Eqn 3.2.8) is

Fig. 3.3-8 Travel time curves T(x) for a layer over a
halfspace and their representation in the $(\tau, p)$ plane.
Each point on the T(x) curves has a slope (ray parameter)
$p$ and intercept $\tau$. The linear travel time curves for the
direct and head waves each map into a point (square and
circle) in the $(\tau, p)$ plane. The hyperbolic travel time curve
for the reflection maps into an ellipse in the $(\tau, p)$ plane.
Note how an arbitrary point on the reflection’s travel
time curve, marked by the diamond, maps into the other
two curves.



Fig. 3.3-9 Relation between the travel time curve T(x) and the
function $\tau(p)$ for multiple layers over a halfspace. D denotes the
direct wave; $R_i$ and $H_i$ are reflections and head waves at the top
of the $i$th layer; $x_c$ is the critical distance for $H_1$. (After Diebold
and Stoffa, 1981. Reproduced by permission of the Society of
Exploration Geophysicists.)




$$ T_H(x) = x/v_1 + 2h_0 (1/v_0^2 - 1/v_1^2)^{1/2} = x/v_1 + \tau_1, \qquad (45) $$

a line with slope equal to the reciprocal of the halfspace
velocity, $p = 1/v_1$, and intercept $\tau_1$. Thus the head wave maps
into a point on the ellipse describing the reflected wave,
corresponding to the critical distance where the head and reflected
waves are the same. To see this, note that for $p = 1/v_1$, Eqn 44
gives

$$ x(p) = -d\tau/dp = 2h_0 v_0 (v_1^2 - v_0^2)^{-1/2} = x_c. \qquad (46) $$


This point divides the ellipse describing the reflected wave into
a subcritical portion, between the $\tau$ axis and the head wave,
and a postcritical portion, between the head wave and the $p$
axis. We will see shortly that the fact that different arrivals
have distinct locations in the $\tau(p)$ plane provides the basis for
techniques that can separate these arrivals.

This analysis can be extended to more complex geometries.
For multiple layers, the $\tau(p)$ curves corresponding to reflections
off successive layers are all portions of different ellipses
(Fig. 3.3-9). For a continuous velocity distribution, the summation
for $\tau$ (Eqn 37) becomes an integral

$$ \tau(p) = 2\int \eta(z)\,dz = 2\int (1/v^2(z) - p^2)^{1/2}\,dz = 2\int (u^2(z) - p^2)^{1/2}\,dz. \qquad (47) $$

Fig. 3.3-10 Schematic geometry of a multichannel seismic reflection
survey with a single source (star) and eight receivers (dots) moving along
a survey line. Each physical experiment produces eight seismograms
corresponding to ray paths (dashed lines) with a single source location
and a range of receiver locations. Four seismograms from different source
and receiver positions, corresponding to the ray paths shown by solid
lines, sample the same point at depth on a flat reflector. These have the
same midpoint halfway between source and receiver, but different
source-to-receiver offsets.





Fig. 3.3-11 Relation between source, receiver, midpoint, and offset 
coordinates measured along the survey line. Any two specify an 
individual seismogram. 



Fig. 3.3-12 An individual trace is characterized by its position in a 
two-dimensional diagram showing its source, receiver, midpoint, and 
offset coordinates. Dots show the traces indicated in Fig. 3.3-10. Physical 
experiments correspond to a common source point (CSP) gather; the four 
traces in Fig. 3.3-10 with the same midpoint form the common midpoint 
(CMP) gather shown. 


Formulating travel time curves as $\tau(p)$ is useful for some
techniques that invert for velocity structure.

3.3.3 Multichannel data geometry 

A feature of reflection seismology is multichannel geometry, 
the use of multiple source and receiver locations, so that points 
on reflecting interfaces are sampled repeatedly. Figure 3.3-10 
illustrates how such coverage is accomplished by combining 
experiments performed with a seismic source and an array of 
eight receivers at fixed distances from the source. Each time the 
source is activated, eight seismograms, or traces, are recorded.


The source and the receivers are then moved, and the experi¬ 
ment is repeated, giving eight more traces. Eventually each 
point on the reflector is sampled four times, producing “four¬ 
fold coverage.” 

We assume initially that the velocity structure is layered and 
varies only with depth. Even so, the four seismograms that 
sample the same point are not identical, because they corre¬ 
spond to different source and receiver positions, and thus differ¬ 
ent offset distances between the source and the receiver. Hence 
each trace is a record of displacement, or pressure, as a function
of time, $t$, $u(s, r, t)$, characterized by the source and receiver
positions.

Fig. 3.3-13 Schematic of the four different gather types. (Surviving
panel titles: common receiver gather, common source gather.)




The data are analyzed by grouping the seismograms that 
sampled the same point on the reflector. In this flat-layered 
geometry, these seismograms have the same point, known as
the midpoint, halfway between the source and the receiver. For
each midpoint, there is a set of traces with different offsets. The
midpoint $m$ and offset $f$ are defined in terms of the source
location $s$ and the receiver position $r$ as

$$ m = (s + r)/2, \qquad f = (s - r). \qquad (48) $$

Thus an individual seismogram is specified by either the source 
and receiver positions, or the midpoint and offset (Fig. 3.3-11). 
These are plotted using two perpendicular axes (Fig. 3.3-12), 
one for the source location and one for the receiver position. 
The midpoint and offset for each seismogram are indicated by 
distance along axes 45° from the s and r axes. Note that the 
scales on these axes differ from the other two. 

To illustrate this relationship, consider the four experiments 
in Fig. 3.3-10, with eight receivers and a single source. Each 
experiment produced data at points, shown by dots, with 
constant source position and successive receiver positions. 
Successive experiments yielded data along a similar horizontal 
line, but displaced by the motion of the source and the receiver. 

The data can be sorted and combined in various ways that 
need not correspond to an actual experiment (Fig. 3.3-13). 
Each experiment corresponds to a set of records with the same
source position, a common source point, or CSP, gather. Traces
with the same midpoint and different offsets can be grouped
in a common midpoint, or CMP, gather. Similarly, common
receiver point and common offset gathers can be formed.
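
The sorting described above can be sketched in a few lines of Python. This is an illustrative example only: the survey layout (ten shots, eight receivers at unit spacing ahead of the source, and the source advancing one receiver interval per shot) and the names `midpoint_offset` and `cmp_gathers` are assumptions chosen so that the steady-state coverage is four-fold, as in Fig. 3.3-10.

```python
from collections import defaultdict

def midpoint_offset(s, r):
    """Midpoint m = (s + r)/2 and offset f = s - r (Eqn 48)."""
    return (s + r) / 2.0, s - r

# Hypothetical survey geometry in units of the receiver spacing.
n_shots, n_receivers = 10, 8
traces = []
for shot in range(n_shots):
    s = float(shot)                          # source position for this shot
    for k in range(1, n_receivers + 1):
        traces.append((s, s + k))            # (source, receiver) pair

# Each physical experiment is a common source point gather; regrouping
# the same traces by midpoint gives the common midpoint gathers.
cmp_gathers = defaultdict(list)
for s, r in traces:
    m, f = midpoint_offset(s, r)
    cmp_gathers[m].append((s, r, f))

fold = {m: len(gather) for m, gather in cmp_gathers.items()}
max_fold = max(fold.values())
```

With this geometry, midpoints in the interior of the line are each sampled by four traces with different offsets, whereas midpoints near the ends of the line have lower fold.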

Ordering traces by midpoint and offset makes no distinction 
between a source at position a and a receiver at position b, 
or the reverse. This assumption is justified by the principle of 
reciprocity, by which these two geometries should produce
identical seismograms. Thus a common receiver point gather 


can simulate a reversed profile (Section 3.2.2) because, by 
reciprocity, it gives the same data as a common source point 
gather shot in the opposite direction. 

Later in this section, we will discuss a few aspects of the 
data collection process. The sources can be explosives, sound 
sources in water, or vibration sources on land. The source co¬ 
ordinate is thus sometimes referred to as a source point, shot 
point, or vibration point. The receivers are typically single¬ 
component vertical seismometers, known as geophones, for 
land applications, and pressure transducers, or hydrophones, 
for marine surveys. The receiver coordinate is thus often 
termed the geophone coordinate. Generally large numbers of 
receivers, which are themselves groups of receivers, are used. 
Increasingly, data are collected over two-dimensional areas, and 
so are processed to yield three-dimensional velocity structures. 

3.3.4 Common midpoint stacking 

Because the traces in a CMP gather have ideally sampled the 
same subsurface point with different offsets, they can be com¬ 
bined to enhance reflected arrivals. The process begins with a 
set of traces showing the data as a function of offset and time. 
The data contain “signals” of interest, primary reflections from 
interfaces that are used to determine velocity structure with 
depth. The data also contain “noise,” arrivals of no interest, in¬ 
cluding direct waves, head waves, 4 surface waves (sometimes 
termed “ground roll”), and waves from the source that travel in 
the air. The data may also contain arrivals (Fig. 3.3-14) that 
have been reflected more than once, which are known as mul¬ 
tiples, by contrast with the once-reflected primary reflections. 

To enhance primary reflections and suppress everything 
else, we exploit the fact that the arrival times of various signals 

4 In the previous section we focused on direct and head waves, illustrating the adage 
that “one person’s signal is another’s noise.” 



Fig. 3.3-14 Geometry of various multiple reflections: primary,
double-path multiple, near-surface multiple, and peg-leg multiple.
(After Kearey and Brooks, 1984.)


vary in different ways between traces as a function of offset 
(Fig. 3.3-15). Reflections have hyperbolic travel time curves, 
whereas direct waves, head waves, surface waves, and air 
waves have linear travel time curves. Other noise may be 
essentially incoherent between traces. 

Consider a reflection whose variation in travel time with 
offset is the normal moveout (NMO), 

$$ T(x) - t_0 = (x^2/V^2 + t_0^2)^{1/2} - t_0, \qquad (49) $$

where $t_0$ and $V$ are the vertical two-way time and rms velocity.
If each trace is shifted forward in time by the appropriate 
NMO, this reflection appears at the same time for all offsets 
(Fig. 3.3-15). By contrast, arrivals with different moveouts, 
such as the direct wave, do not align. Similarly, multiple 
reflections do not align, because they reflected off shallower 
interfaces than primary reflections with a similar arrival time, 
and thus have a lower rms velocity. This method is similar to 
forming reduced travel time plots (Section 3.2), where a linear 
time shift lines up direct or head waves whose linear travel time 
curve has apparent velocity equal to the reducing velocity. In 
this case, the hyperbolic time shift lines up reflections with 
hyperbolic travel time curves. 

Fig. 3.3-15 Schematic example of the normal moveout correction,
shown for the three arrivals for a single layer. NMO aligns all traces
(lower panel) in a common midpoint gather by a time shift corresponding
to the hyperbolic travel time curve of a reflection. The desired reflection
is thus in phase between traces, whereas other arrivals are out of phase.
CMP stacking, which adds the traces after this time shift, enhances the
desired reflection and suppresses other arrivals.

If the traces are added after this time shift, the resulting sum,
in theory, is the single trace that would have been recorded at
zero offset, with coincident source and receiver. The reflection
that was aligned is in phase on all traces, and thus sums
constructively and gives a strong arrival. By contrast, other arrivals
will have been shifted such that they are sometimes out of
phase, and thus sum destructively, yielding weaker arrivals.
The process of time shifting and then summing the traces with
different offsets for a given midpoint is called common midpoint
(CMP) stacking.
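
As an illustration of the moveout correction and stack, the following Python sketch builds a synthetic CMP gather and stacks it. It is a simplified example under stated assumptions, not a production algorithm: the offsets, sampling, Gaussian pulse shape, velocities, and the names `nmo_correct` and `pulse` are all hypothetical, and the time shift is applied by linear interpolation.

```python
import numpy as np

def nmo_correct(gather, offsets, t, v):
    """Hyperbolic shift of Eqn 49: out(t0) = in(sqrt(t0**2 + x**2 / v**2))."""
    out = np.zeros_like(gather)
    for i, x in enumerate(offsets):
        t_in = np.sqrt(t**2 + (x / v) ** 2)   # arrival time at offset x for each t0
        out[i] = np.interp(t_in, t, gather[i], left=0.0, right=0.0)
    return out

# Hypothetical gather: 11 offsets, 4 ms sampling, 2 s record length.
dt, nt = 0.004, 501
t = np.arange(nt) * dt
offsets = np.arange(0.0, 2001.0, 200.0)            # m
t0_true, v_true, v_direct = 1.0, 2000.0, 1000.0    # s, m/s, m/s

def pulse(t_arrival):
    """Gaussian pulse standing in for an arrival at time t_arrival."""
    return np.exp(-0.5 * ((t - t_arrival) / 0.01) ** 2)

# One reflection (hyperbolic moveout) plus a direct wave (linear moveout).
gather = np.array([
    pulse(np.sqrt(t0_true**2 + (x / v_true) ** 2)) + pulse(x / v_direct)
    for x in offsets
])

corrected = nmo_correct(gather, offsets, t, v_true)
stack = corrected.mean(axis=0)        # CMP stack: average over offsets
t_peak = t[np.argmax(stack)]
```

In this example the reflection aligns at its zero-offset time and dominates the stack, whereas the direct wave, which has linear moveout, is spread over different times on different traces and is reduced by the averaging.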


Fig. 3.3-16 Schematic of CMP stacking and velocity
analysis. Left: Stacking is done for a range of stacking
velocities, each corresponding to a different hyperbola
in offset-time space. Right: The peak in the velocity
spectrum, or power in the resulting stack, shows the
best stacking velocity. (After Taner and Kohler, 1969.
Reproduced by permission of the Society of Exploration
Geophysicists.)

Fig. 3.3-17 Example of CMP stacking and velocity analysis. Velocity analysis at different times yields the best stacking velocity as a function of time
(bottom). The stacking velocity increases with time because later arrivals reflected off deeper interfaces. (After Taner and Kohler, 1969. Reproduced by
permission of the Society of Exploration Geophysicists.)


Fig. 3.3-18 Schematic geometry illustrating 
formation of a zero-offset section by 
common midpoint stacking. Each CMP 
gather is stacked over all offsets, as shown 
by the dashed lines like B-B', to produce a 
single zero-offset trace for that midpoint. 
Taken together, these traces form a zero- 
offset section, a plane in midpoint-time 
space, containing arrivals like that shown 
by the solid curve A-F. (After Robinson, 
1983. Migration of Geophysical Data, 

© 1983, p. 24. Reprinted by permission 
of Pearson Education.) 



Real data contain more than one reflection, and the appro¬ 
priate velocities are unknown. Thus the velocities are found by 
stacking with a range of velocities and determining which gives 
the best results. As illustrated in Fig. 3.3-16, traces are stacked 
along hyperbolas corresponding to different velocities. The 
stack output as a function of stacking velocity, known as a 
velocity spectrum, has peak amplitude at the velocity that best 
aligns arrivals on the different traces. This stacking velocity is 
close to the rms velocity if the data are reasonably good and the 
structure is approximately a set of flat layers. 
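
A velocity scan of this kind can be sketched as follows. The example is hypothetical: it uses a noise-free synthetic gather with one reflection, a coarse grid of trial velocities, and the total power of the stacked trace as the measure of alignment. The names `nmo_stack`, `trial_velocities`, and `power` are assumptions introduced for illustration.

```python
import numpy as np

def nmo_stack(gather, offsets, t, v):
    """NMO-correct each trace with trial velocity v, then average over offsets."""
    stack = np.zeros_like(t)
    for trace, x in zip(gather, offsets):
        t_in = np.sqrt(t**2 + (x / v) ** 2)
        stack += np.interp(t_in, t, trace, left=0.0, right=0.0)
    return stack / len(offsets)

# Hypothetical noise-free gather with a single reflection.
dt, nt = 0.004, 501
t = np.arange(nt) * dt
offsets = np.arange(0.0, 2001.0, 200.0)       # m
t0_true, v_true = 1.0, 2000.0                 # s, m/s
gather = np.array([
    np.exp(-0.5 * ((t - np.sqrt(t0_true**2 + (x / v_true) ** 2)) / 0.01) ** 2)
    for x in offsets
])

# Velocity spectrum: power of the stack for each trial stacking velocity.
trial_velocities = np.arange(1500.0, 2501.0, 100.0)
power = np.array([
    np.sum(nmo_stack(gather, offsets, t, v) ** 2) for v in trial_velocities
])
best_velocity = trial_velocities[np.argmax(power)]
```

For real data the scan would be repeated in time windows, because the best stacking velocity varies with arrival time.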

Because later reflections have higher rms velocities, they 
yield higher stacking velocities. Thus velocity analysis is con¬ 
ducted as a function of time. In Fig. 3.3-17, the best stacking 


velocity, indicated by the maximum in the velocity spectrum, 
increases with time for deeper arrivals. This increase tunes
the stacking to bring arrivals with various stacking velocities 
“into focus.” Peaks in the power of the velocity spectrum show 
the arrival of strong coherent reflections. At later times there 
are several peaks, as multiples arrive. Using the stacking veloci¬ 
ties, interval velocities for different depths are found from the 
Dix equation. 
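
The conversion from stacking velocities to interval velocities can be sketched as below. The form of the Dix equation used here is the standard one, $V_{int}^2 = (V_n^2 t_n - V_{n-1}^2 t_{n-1})/(t_n - t_{n-1})$, which is assumed rather than quoted from this section. The function name and the two-layer test values are hypothetical.

```python
import numpy as np

def dix_interval_velocities(t0, v_rms):
    """Interval velocities from zero-offset times t0 and rms velocities v_rms.

    Assumes the standard Dix form, with the surface (t = 0) as the top of
    the first interval.
    """
    t0 = np.asarray(t0, dtype=float)
    v_rms = np.asarray(v_rms, dtype=float)
    num = np.diff(v_rms**2 * t0, prepend=0.0)
    den = np.diff(t0, prepend=0.0)
    return np.sqrt(num / den)

# Hypothetical two-layer check: build rms velocities from known interval
# velocities and two-way times in each layer, then recover the intervals.
v_int_true = np.array([2000.0, 3000.0])       # m/s
dt_layers = np.array([1.0, 0.5])              # two-way time in each layer, s
t0 = np.cumsum(dt_layers)
v_rms = np.sqrt(np.cumsum(v_int_true**2 * dt_layers) / t0)

v_int = dix_interval_velocities(t0, v_rms)
```

Because the calculation differences nearly equal quantities, small errors in closely spaced stacking velocities can produce large errors in the interval velocities.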

Figure 3.3-18 illustrates the CMP concept geometrically.
The traces give displacement or pressure as a function of midpoint,
offset, and time, $u(m, f, t)$. CMP gathers can be thought
of as planes parallel to the offset and time axes, each with the
appropriate midpoint. Each gather is stacked over all offsets
to produce a zero-offset trace for that midpoint. These traces
together form a zero-offset seismic section, $u(m, 0, t)$, a function
of midpoint and time. This section simulates moving along
the survey line with a single source and receiver at the same
location, and recording arrivals from below as a function of
time. Because this process reduces the volume of data dramatically,
there is a tendency to conduct processing operations after,
rather than before, stacking when possible.

Fig. 3.3-19 Top: Traces with a common midpoint sample the same point
on a reflector when a reflector and the structure above it are horizontal.
Bottom: If the structure dips, traces with a common midpoint do not
reflect at the same point. (After Kearey and Brooks, 1984.)

Often a CMP stack is referred to as a CDP, or common depth 
point, stack. CMP is a better term, because traces with the same 
midpoint have the same reflection point at depth only when a 
reflector and the structure above it are flat-lying (Fig. 3.3-19). 
This effect is generally small enough that CMP stacking is 
useful. We will discuss shortly the limitations on reflection 
studies due to deviations from the ideal flat geometry. 

A seismic section is in some ways similar to a “picture” of the 
subsurface. Major arrivals in the data generally represent sig¬ 
nificant reflectors at depth, and can be correlated with geologic 
structure. As a result, analysis of seismic reflection data is a 
powerful geological tool. For example, Fig. 3.3-20 (top) shows
a seismic section across the Peru trench. Data of one polarity
are black, making coherent reflectors more visible. The
interpretation (bottom) indicates the top of the crust of the
subducting Nazca plate, including small grabens, and complex
structures in the overlying accretionary prism.



Fig. 3.3-20 Migrated seismic section across the Peru trench, showing the subducting Nazca plate dipping to the right. The data were collected with air gun
sources shot at 35 m intervals and recorded by a 1600 m-long array with 24 hydrophone groups. The data were sampled every 4 milliseconds. (After von
Huene et al., 1985. J. Geophys. Res., 90, 5429-42, copyright by the American Geophysical Union.)

Fig. 3.3-21 Reflection data (left), showing muting (right) to eliminate the head waves that arrive first and the large surface waves that arrive
later. (After Claerbout, 1985.)


3.3.5 Signal enhancement 

The best hope of reducing artifacts in a seismic section due to 
noise and other difficulties is to exclude them before stacking. 
Thus, as in many signal processing applications, the idea is to 
identify characteristics of the “noise” we seek to reject, and use 
those characteristics to exclude it. 

For example, variations in the thickness of a near-surface 
low-velocity layer due to weathering produce arrival time 
variations. Similar variations can result from sea floor topo¬ 
graphy, because the water is a low-velocity material of varying 
thickness, or from elevation changes along a land survey. These 
shifts can cause the travel time of reflections to deviate from the 
hyperbolic moveout with offset assumed in stacking, and hence 
degrade a stacked section and produce spurious relief on a 
deeper reflector. To minimize these problems, a static time
correction, shifting traces back or forward in time, can be applied.

Direct waves, head waves, surface waves, air waves, and the 
like are often identifiable on CSP gathers from their arrival 
times and linear travel time curves. Data corresponding to the 
time-distance ranges in which the undesired arrivals appear 
can be set equal to zero, or muted before the gathers are stacked 
(Fig. 3.3-21). 

Another approach to isolating reflections uses the fact that 
the apparent velocity along the surface, 

$$ c_x = 1/p = v/\sin i = \omega/k_x, \qquad (50) $$

is higher for reflections, which have angles of incidence close
to the vertical, than for surface or air waves. Hence the reflections
have a longer apparent wavelength along the surface,
$\lambda_x = 2\pi c_x/\omega$. Thus the effects of surface waves can be reduced by
summing a group of receivers to produce a single trace. Arrivals


with wavelengths shorter than the length of the group interfere 
destructively and are reduced in amplitude, enhancing the 
longer-wavelength reflections. Hence traces from a single 
source-receiver pair are often actually a sum of a number of 
geophones or hydrophones. In this way, the data collection pro¬ 
cess, rather than subsequent analysis, enhances the reflections. 

Differences in the apparent velocity can also be used to en¬ 
hance reflections after the data are collected. In this approach, 
arrivals with different apparent velocities on common source 
gathers are separated by velocity filtering , using a double 
Fourier transform. As we saw in Section 2.8.2, and discuss further
in Chapter 6, the Fourier transform and inverse transform
relate a function of time $f(t)$ and its transform $F(\omega)$, a function
of angular frequency,

$$ F(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt, \qquad f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega) e^{i\omega t}\,d\omega. \qquad (51) $$


Similarly, because the wavenumber is the spatial frequency 
(Section 2.2.2), it is related to the distance in the same way 
that angular frequency is related to time. Hence, a function of 
the horizontal distance $g(x)$ and its corresponding function
of horizontal wavenumber $G(k_x)$ are related by the Fourier
transform pair

$$ G(k_x) = \int_{-\infty}^{\infty} g(x) e^{i k_x x}\,dx, \qquad g(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} G(k_x) e^{-i k_x x}\,dk_x. \qquad (52) $$

By convention, opposite signs are used in the exponentials for
the time and space transforms.

A gather $u(x, t)$ is the displacement as a function of horizontal
distance and time, so the double Fourier transform,

$$ U(k_x, \omega) = \int\!\!\int u(x, t) \exp[i(-\omega t + k_x x)]\,dx\,dt, \qquad (53) $$

converts it to the horizontal wavenumber and angular frequency
domains. Plotting the transform as a function of $k_x$
and $\omega$ (or, equivalently, $k_x$ and frequency $f$) separates the data
into portions of different apparent velocity, because a given
velocity, $c_x = \omega/k_x$, plots as a straight line (Fig. 3.3-22). It is
thus possible to suppress arrivals with a given range of apparent
velocities by setting the data in some region of $(k_x, \omega)$ space
to zero, and inverse transforming the data back to $(x, t)$ space,
using the inverse of the double Fourier transform

$$ u(x, t) = \frac{1}{4\pi^2} \int\!\!\int U(k_x, \omega) \exp[i(\omega t - k_x x)]\,dk_x\,d\omega. \qquad (54) $$

Fig. 3.3-22 Velocity filtering by Fourier
transformation into the horizontal
wavenumber and frequency domain.
Top: Positions of reflected waves, noise,
air waves, and surface waves in the $(k_x, f)$
plane. Slopes correspond to lines of
equal apparent velocity (in ft/s). (After
Kanasewich, 1981.) Bottom: Common
source gather before and after velocity
filtering. Surface waves have been
suppressed by removing low apparent
velocity data, thus enhancing reflections.
(Hosking Geophysical.)

Rather than having an abrupt boundary, the data at the edges 
of the portion of the (k x , co) space of interest are tapered 
smoothly to zero for reasons discussed in Chapter 6. 
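
A bare-bones version of this velocity filter can be sketched with NumPy's discrete Fourier transforms. The sketch makes several simplifying assumptions: it uses an abrupt mask rather than the smooth taper recommended above, it uses NumPy's sign convention for both axes (which does not matter here because only the magnitude of the apparent velocity is tested), and the synthetic gather, Ricker wavelet, velocities, and function names are all hypothetical.

```python
import numpy as np

def fk_velocity_filter(u, dx, dt, v_min):
    """Zero all (k_x, f) components with apparent velocity |f / k_x| below v_min.

    u has shape (n_traces, n_samples). The mask is abrupt (no taper).
    """
    nx, nt = u.shape
    U = np.fft.fft2(u)
    k = np.fft.fftfreq(nx, d=dx)[:, None]    # cycles per unit distance
    f = np.fft.fftfreq(nt, d=dt)[None, :]    # cycles per unit time
    keep = np.abs(f) >= v_min * np.abs(k)
    return np.real(np.fft.ifft2(U * keep))

def ricker(tau, f_peak=10.0):
    """Zero-mean Ricker wavelet with peak frequency f_peak (Hz)."""
    a = (np.pi * f_peak * tau) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# Hypothetical common source gather: 64 traces at 10 m, 512 samples at 4 ms.
dx, dt = 10.0, 0.004
x = np.arange(64) * dx
t = np.arange(512) * dt
reflection = np.tile(ricker(t - 0.5), (x.size, 1))    # flat event, infinite apparent velocity
surface_wave = ricker(t[None, :] - 0.1 - x[:, None] / 500.0)   # 500 m/s apparent velocity

filtered_reflection = fk_velocity_filter(reflection, dx, dt, v_min=2000.0)
filtered_surface = fk_velocity_filter(surface_wave, dx, dt, v_min=2000.0)
surface_energy_ratio = np.sum(filtered_surface**2) / np.sum(surface_wave**2)
```

In this idealized case the flat event passes unchanged, and most of the energy of the slow event is removed. The energy that remains comes from leakage caused by the finite aperture, which is one reason a taper is used in practice.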

Thus the double Fourier transform converts data containing
arrivals that overlap in the $(x, t)$ domain into the $(k_x, \omega)$
domain, where the arrivals have distinct properties that make
it easy to separate them. This separation is exploited to filter
the data, which are then transformed back to $(x, t)$. Velocity
filters are also called dip filters, because they separate arrivals
based on their slope (dip) in the $(k_x, \omega)$ domain. A variant
of this method for data recorded in two spatial dimensions,
$u(x, y, t)$, is to take the triple Fourier transform $U(k_x, k_y, \omega)$.
Because the transform is in terms of both components of the
horizontal wave vector, it can be filtered to suppress arrivals
coming from certain directions.

Fig. 3.3-23 Schematic illustration of slant
stacking: data are summed along lines in the
$(x, t)$ plane (left) corresponding to values of
intercept $\tau$ and slope $p$, and so yield points
in the $(\tau, p)$ plane (right).

Another approach to transforming data such that compon¬ 
ents are more easily separated uses the intercept-slowness 
formulation of travel time curves. As discussed in Section 3.3.2,
the function $\tau(p)$ describes each point on a travel time curve
$T(x)$ by the time axis intercept $\tau$ and the slope $p$ of the line
tangential to the curve at that point. Thus seismic data can be
described as functions either of position and time, $u(x, t)$, or of
slope and intercept, $u(\tau, p)$. To transform from one representation
to the other, the data $u(x, t)$ are summed along lines of
constant slope in the $(x, t)$ plane, which correspond to values of
intercept $\tau$ and slope $p$ (Fig. 3.3-23),

$$ u(\tau, p) = \int u(x, \tau + px)\,dx. \qquad (55) $$

This integral, which maps all the data along each slanted line in
$(x, t)$ to a point in $(\tau, p)$, is called a slant stack, or Radon
transform, of the data. It is also called a plane wave decomposition,
because it decomposes the data according to $p$,
the reciprocal of the apparent velocity of a plane wave. The inverse
slant stack operation that transforms the slant stack back
into the $(x, t)$ space can be written 5

5 Claerbout (1985). 


$$ u(x, t) = -\frac{1}{2\pi^2 t^2} * \int u(t - px, p)\,dp, \qquad (56) $$


where $*$ is the convolution operation, discussed shortly. This
expression is similar to a slant stack in the $(\tau, p)$ plane, because
data are summed along a line of constant $t$.

All the data are mapped from one domain into the other, 
so no data are lost by this transformation. Thus, after slant 
stacking, we can use the fact that the $\tau(p)$ representation of
the travel time curve is in some ways simpler than the $T(x)$
representation. Because different arrivals fall in different parts
of the $(\tau, p)$ plane (Fig. 3.3-8), undesired arrivals can be suppressed
by zeroing portions of the data. For example, the
gather in Fig. 3.3-24 shows a strong surface wave, the late-arriving
linear arrival with an apparent velocity of about
1.35 km/s and intercept about 0. In the usual $(x, t)$ space,
it would be hard to filter out this arrival without suppressing
the reflections. After slant stacking, this arrival shows up as a
region of large amplitude with $\tau \approx 0$ and $p = 1/1350$ s/m $\approx$
740 μs/m. Once the slant stack is filtered by eliminating all
data with $p > 650$ μs/m and inverse transformed, the surface
wave is significantly reduced. In practice, rather than having
an abrupt boundary, the data at the edges of the portion of the
$(\tau, p)$ space of interest are tapered smoothly to zero for reasons
discussed in Chapter 6.
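
The forward slant stack of Eqn 55 can be approximated by a direct sum over traces. The Python sketch below is an illustration under simple assumptions: regular trace spacing, linear interpolation in time, and a synthetic gather with one linear arrival whose slowness matches the surface wave discussed above but whose intercept (0.1 s) is chosen arbitrarily to keep the arrival away from the edge of the record. The function and variable names are hypothetical.

```python
import numpy as np

def slant_stack(u, x, t, p_values):
    """Discrete form of Eqn 55: sum u(x, tau + p*x) over x for each (tau, p).

    u has shape (n_traces, n_samples); returns shape (n_p, n_samples),
    with the time axis of the output interpreted as intercept tau.
    """
    dx = x[1] - x[0]
    out = np.zeros((len(p_values), t.size))
    for ip, p in enumerate(p_values):
        for ix, xi in enumerate(x):
            out[ip] += np.interp(t + p * xi, t, u[ix], left=0.0, right=0.0)
    return out * dx

# Hypothetical gather with one linear arrival, T(x) = tau_true + p_true * x.
dt = 0.004
t = np.arange(501) * dt
x = np.arange(0.0, 1001.0, 25.0)                 # m
p_true, tau_true = 1.0 / 1350.0, 0.1             # s/m, s
u = np.exp(-0.5 * ((t[None, :] - tau_true - p_true * x[:, None]) / 0.01) ** 2)

p_values = np.arange(51) * 20e-6                 # 0 to 1000 microseconds per meter
tp = slant_stack(u, x, t, p_values)
ip, it = np.unravel_index(np.argmax(tp), tp.shape)
p_peak, tau_peak = p_values[ip], t[it]
```

The linear arrival collapses to a compact region near its slope and intercept, which is what allows it to be suppressed by zeroing part of the $(\tau, p)$ plane.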

The slant stack and velocity filtering with the double Fourier
transform are related, because both exploit properties of the
data associated with the apparent velocity. As a result, slant
stacking can be done by transforming data to the $(k_x, \omega)$
domain, evaluating the transform for constant values of the ray
parameter, and then inverse transforming to the time domain.

Fig. 3.3-24 Left: Common source point gather of Vibroseis data from Alaska, showing prominent late-arriving surface waves with an apparent velocity of
about 1.35 km/s and intercept about 0. Center: Slant stack of the data. The $p$ axis is labeled both with values of $p$ (μs/m) and apparent velocity (km/s). The
surface waves appear as a region of large amplitude with $\tau \approx 0$ and $p \approx 740$ μs/m. Right: The inverse slant stack, after suppression of data with $p > 650$ μs/m,
shows the surface wave significantly reduced. (Tatham, 1989. With kind permission from Kluwer Academic Publishers.)



Fig. 3.3-25 Left: Schematic of an air gun, a 
common marine seismic source. (Fig. 3.18 
in Kearey and Brooks, 1984, redrawn with 
permission of Bolt Associates and Sodera 
Ltd.) Right: Source wavelets (pressure 
versus time) for a single air gun and an 
array of air guns. The array reduces the 
bubble pulse and makes the wavelet 
more impulsive, though it still contains 
additional unwanted complexity. (Fig. 3.19 
in Kearey and Brooks, 1984. Redrawn with 
permission of Bolt Associates.) 


3.3.6 Deconvolution 

Another useful technique, deconvolution, “sharpens” the
reflections from interfaces. Ideally, each reflection would be a
sharp pulse approximating a delta function, so the arrival time
of the reflection and the depth of the reflector would be determined
precisely. The sharpness of the reflected pulse determines
vertical resolution: how close in travel time, and thus depth,
two interfaces can be and still give distinct reflected arrivals.

Fig. 3.3-26 Schematic geometry of a Vibroseis survey (top) and master sweep signal (center). The synthetic field records (bottom) contain interfering reflections off various
interfaces, and so require processing to identify individual reflections. (With permission of Conoco.)

Seismic sources do not generate delta function signals.
Figure 3.3-25 shows the signal produced by an air gun, a common
source used in marine surveys. The damped oscillation results
from expansion and contraction of the air bubble that the gun
injects into the water. The signal can be sharpened using multiple
air guns offset in time, which interfere to give a sharper
pulse. Figure 3.3-26 shows the “sweep” signal generated by a
Vibroseis 6 unit, a truck-mounted seismic source used in land
surveys. The signal extends for a period of time $T$ (typically
7-35 s) over which the frequency varies through a range
generally within 10-60 Hz. Such signals, also called chirps,
can be written

$$ w(t) = \cos\left[ 2\pi \left( f_1 t + \frac{f_2 - f_1}{2T} t^2 \right) \right]. \qquad (57) $$

Because the duration of the sweep is often longer than the differ¬ 
ence in travel time between interfaces, the resulting seismogram 
is a complicated combination of sweep signals with different 
amplitudes and time delays reflected from different interfaces. 

6 Vibroseis is a trademark of the Continental Oil Company. The first such continu¬ 
ously operating variable frequency seismic source was invented by Selwyn Sacks in his 
Ph.D. thesis in 1961. 


Thus reflection data, like any other seismograms, include the 
effects of both the source and the structure. Separating these 
effects is a basic theme in seismology, because we are usually 
interested in either the source (as for earthquakes) or the struc¬ 
ture, as in this application. To separate source and structure, 
we describe a seismogram, $s(t)$, as resulting from the source
pulse, known in reflection applications as a wavelet, $w(t)$, and
a time series that describes the effects of the structure, in this
case a reflector series, $r(t)$.

Fig. 3.3-27 Schematic of a ray path through several interfaces,
showing how the amplitude depends on the product of the reflection
and transmission coefficients along the path.

To find the reflector series, we recall from Section 2.6.7 that
a wave with initial unit amplitude acquires an amplitude equal
to the product of the reflection and transmission coefficients
along its path. Thus, for a set of layers with velocity $v_i$ and
thickness $h_i$, the amplitude of the primary reflection from the
bottom of the $i$th layer is the product of the reflection coefficient
at the base of the layer times all the transmission coefficients for
both the up and down parts of the path,

$$R_{i,i+1} \prod_{j=0}^{i-1} T_{j,j+1}\, T_{j+1,j}, \tag{58}$$

where $\prod$ denotes the product of the indicated terms. For
example, the reflection off the base of the second layer has amplitude

$$R_{23} T_{01} T_{10} T_{12} T_{21} = T_{01} T_{12} R_{23} T_{21} T_{10},$$

where the second form shows the order of interactions along the
path (Fig. 3.3-27). In dealing with reflection data, the vertical
incidence reflection and transmission coefficients are generally
suitable approximations. Hence, the reflection and transmission
coefficients are given by the densities and velocities at each
interface


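$$R_{i,i+1} = \frac{\rho_i v_i - \rho_{i+1} v_{i+1}}{\rho_i v_i + \rho_{i+1} v_{i+1}}, \qquad T_{i,i+1} = \frac{2 \rho_i v_i}{\rho_i v_i + \rho_{i+1} v_{i+1}}, \tag{59}$$

and the reflection arrives at a two-way travel time of

$$t_i = 2 \sum_{j=0}^{i} \frac{h_j}{v_j}, \tag{60}$$
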


which is the sum of the vertical travel times in each of the
layers. Thus the reflector series for primary reflections off a set
of N layers is a sum of impulses, each corresponding to the
reflection from the bottom of the $i$th layer,

$$r(t) = \sum_{i=0}^{N} \delta(t - t_i)\, R_{i,i+1} \prod_{j=0}^{i-1} T_{j,j+1}\, T_{j+1,j}. \tag{61}$$

$\delta(t - t_i)$ is the delta function, a spike in time that is zero at all
times except $t_i$, when it equals 1. The reflector series is thus a set
of spikes with the appropriate amplitude and arrival time, each
corresponding to a specific reflection.
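The reflector series can be computed directly from a layered model. The following is a minimal sketch under the vertical-incidence coefficients and two-way times given above (Eqns 59-61); the function names and the three-layer model (densities in g/cm³, velocities in km/s, thicknesses in km) are hypothetical values chosen only for illustration.

```python
def reflection_coefficient(rho1, v1, rho2, v2):
    """Vertical-incidence R for a wave in medium 1 hitting medium 2 (Eqn 59)."""
    return (rho1 * v1 - rho2 * v2) / (rho1 * v1 + rho2 * v2)

def transmission_coefficient(rho1, v1, rho2, v2):
    """Vertical-incidence T for a wave passing from medium 1 into medium 2 (Eqn 59)."""
    return 2.0 * rho1 * v1 / (rho1 * v1 + rho2 * v2)

def reflector_series(rho, v, h):
    """Return [(two-way time, amplitude)] for primaries off the base of each layer.

    rho and v have one more entry than h: the last entry is the underlying half-space.
    """
    spikes = []
    two_way_time = 0.0
    transmission = 1.0  # product of T(down) * T(up) over the interfaces already crossed
    for i in range(len(h)):
        two_way_time += 2.0 * h[i] / v[i]                       # Eqn 60
        r = reflection_coefficient(rho[i], v[i], rho[i + 1], v[i + 1])
        spikes.append((two_way_time, r * transmission))         # one term of Eqn 61
        down = transmission_coefficient(rho[i], v[i], rho[i + 1], v[i + 1])
        up = transmission_coefficient(rho[i + 1], v[i + 1], rho[i], v[i])
        transmission *= down * up
    return spikes

# Hypothetical model: two layers over a half-space.
rho = [2.0, 2.2, 2.5]
v = [2.0, 3.0, 4.0]
h = [1.0, 1.5]
spikes = reflector_series(rho, v, h)
```

With this sign convention an increase in impedance with depth gives a negative coefficient, so both spikes in this example are negative. The second is reduced by the two-way transmission loss at the first interface.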

We will see in Chapter 6 that the resulting seismogram is
given by an operation known as the convolution of w(t) and
r(t), which is written

$$s(t) = w(t) * r(t) = \int_{-\infty}^{\infty} w(t - \tau)\, r(\tau)\, d\tau. \tag{62}$$




[Figure 3.3-28 panels: geological section; reflector series * input pulse = seismic trace.]



Fig. 3.3-28 A reflection seismogram can be viewed as the convolution 
of a source wavelet with a reflector series representing the structure. 
The reflector series has impulses at times corresponding to the arrival 
times of reflections with amplitudes given by the reflection coefficients. 
Deconvolution attempts to “spike” the wavelets in the data, revealing 
the reflector series. (After Kearey and Brooks, 1984.) 
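For sampled data the convolution integral becomes a sum. The sketch below is a direct, unoptimized implementation of that sum; the three-sample wavelet and the two-spike reflector series are made-up values used only to show the wavelet being copied, scaled, to each spike time.

```python
def convolve(w, r):
    """Discrete convolution s[n] = sum_k w[n - k] * r[k] of two finite series."""
    s = [0.0] * (len(w) + len(r) - 1)
    for i, wi in enumerate(w):
        for j, rj in enumerate(r):
            s[i + j] += wi * rj
    return s

# Hypothetical wavelet and reflector series (spikes at samples 1 and 5).
wavelet = [1.0, -0.5, 0.25]
reflectors = [0.0, 0.5, 0.0, 0.0, 0.0, -0.3, 0.0]
trace = convolve(wavelet, reflectors)

# A unit spike as the wavelet returns the reflector series unchanged.
spiked = convolve([1.0], reflectors)
```

If the two spikes were closer together than the length of the wavelet, the copies would overlap and sum, which is the interference the caption refers to.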


Eqn 62 defines convolution in the time domain. Convolution
can also be described in the frequency domain,
because the Fourier transform of a convolution equals the
product of the Fourier transforms,

$$S(\omega) = W(\omega) R(\omega). \tag{63}$$

As shown schematically in Fig. 3.3-28, the convolution yields a
trace in which the source wavelet appears at times corresponding
to the spikes in the reflector series, with the appropriate
amplitudes. If the time between the spikes corresponding to
individual reflectors is shorter than the duration of the wavelet,
interference can give a complicated signal.

These expressions show why it would be desirable to have a
delta function source wavelet, because the Fourier transform of
a delta function is simply 1. Thus, if $w(t) = \delta(t)$, the seismogram
would equal the reflector series. Although a physical source
wavelet is not a delta function, the seismograms can be manipulated
mathematically to simulate such a wavelet. This can be
done by creating an inverse filter,⁷ $w^{-1}(t)$, that, when convolved
with the wavelet, yields a delta function

$$w^{-1}(t) * w(t) = \delta(t). \tag{64}$$

Applying this filter, which “spikes” the wavelet, leaves only the
reflector series

$$w^{-1}(t) * s(t) = w^{-1}(t) * w(t) * r(t) = r(t). \tag{65}$$

Because this operation is the inverse of convolution, it is called
deconvolution.

⁷ The notation $w^{-1}(t)$ does not mean $1/w(t)$.

To create the inverse filter, note that the Fourier transform of
the convolution (Eqn 64) yields

$$W^{-1}(\omega) W(\omega) = 1, \tag{66}$$

so the transform of the inverse filter is just $1/W(\omega)$. Hence
deconvolution can be done by dividing the Fourier transforms

$$S(\omega) / W(\omega) = R(\omega). \tag{67}$$

This works well except at frequencies where the source wavelet’s
spectrum is small. Deconvolution makes the arrivals from
reflectors stand out more distinctly (Fig. 3.3-29) and easier to
interpret.

Fig. 3.3-29 Top: Seismic section before deconvolution. Bottom: Seismic
section after deconvolution, showing sharper arrivals for the major
reflections. (Yilmaz, 1987. Reproduced by permission of the Society of
Exploration Geophysicists.)

An alternative, but similar, approach is used with Vibroseis
data for which the wavelet is very long. The goal is to identify
times in the trace when the sweep signal arrives. Similarities
between two time series f(t) and g(t) are shown by their
cross-correlation, an operation (Section 6.3.4) defined by

$$c(L) = \lim_{T \to \infty} \frac{1}{T} \int_{-T}^{T} f(t + L)\, g(t)\, dt. \tag{68}$$


The cross-correlation is largest as a function of L, the lag time, 
when the series are most similar. For finite time series, the 
integration is over the times when f and g are nonzero. A special 
case is the auto-correlation, the cross-correlation of a function
with itself

$$a(L) = \lim_{T \to \infty} \frac{1}{T} \int_{-T}^{T} f(t + L)\, f(t)\, dt, \tag{69}$$



which is always maximum at zero lag. The auto-correlation of 
a Vibroseis sweep, called a Klauder wavelet, is sharply peaked 
at zero lag (Fig. 3.3-30). Thus cross-correlating a sweep with 
the recorded trace is similar to using a spiking filter, because it 
produces sharp spikes when reflections arrive (Fig. 3.3-31). 
This similarity is not surprising, because cross-correlation 
and convolution are similar operations (compare Eqns 62 
and 68). 
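The effect of cross-correlating a Vibroseis trace with the sweep can be checked on synthetic data. The sketch below uses the finite-series, unnormalized form of the cross-correlation and a linear sweep; the 1 s, 10-60 Hz sweep, the 2 ms sampling, and the two reflection delays and amplitudes are all hypothetical values. The two sweeps overlap for most of their length in the trace, yet the correlated output peaks at the two delays.

```python
import math

def sweep_signal(f1, f2, T, dt):
    """Linear sweep cos(2*pi*(f1*t + (f2 - f1)*t**2/(2*T))) sampled on [0, T)."""
    n = int(round(T / dt))
    return [math.cos(2.0 * math.pi * (f1 * k * dt + (f2 - f1) * (k * dt) ** 2 / (2.0 * T)))
            for k in range(n)]

def cross_correlate(f, g, max_lag):
    """c[L] = sum over t of f[t + L] * g[t], for lags L = 0 .. max_lag (unnormalized)."""
    return [sum(f[t + lag] * g[t] for t in range(len(g)) if t + lag < len(f))
            for lag in range(max_lag + 1)]

sweep = sweep_signal(10.0, 60.0, 1.0, 0.002)   # 500 samples

# Hypothetical reflections: sample delay -> amplitude.
reflections = {100: 1.0, 300: 0.6}
trace = [0.0] * 1000
for delay, amplitude in reflections.items():
    for k, w in enumerate(sweep):
        trace[delay + k] += amplitude * w

correlated = cross_correlate(trace, sweep, 450)
strongest = max(range(len(correlated)), key=lambda lag: correlated[lag])
second = max(range(200, 401), key=lambda lag: correlated[lag])
```

Each peak is a Klauder wavelet scaled by the reflection amplitude, so the ratio of the two peak heights is close to the ratio of the amplitudes.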

Reflections can also be enhanced by filtering in the frequency
domain to enhance certain frequency ranges and reject others.
The frequency response of geophones varies, but the records
may contain frequencies as low as a few Hz and in excess of
100 Hz. As a result, the signal-to-noise ratio can vary significantly
as a function of frequency, so filtering often improves
reflection quality. The appropriate frequencies may change
with time in the record. For example, the later-arriving reflections
have longer periods because high-frequency energy is lost
to attenuation, the process by which seismic energy is converted
to heat (Section 3.7).
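Deconvolution by spectral division (Eqn 67) is itself a frequency-domain operation, and it shares the difficulty noted for it: frequencies where the wavelet spectrum is small are amplified. One common stabilization, which is an assumption of this sketch and not a method described in the text, is a "water level" that keeps the divisor from falling below a fixed fraction of its maximum. The naive DFT helper, the function names, and the test wavelet and reflector series are illustrative.

```python
import cmath

def dft(x, sign=-1):
    """Naive discrete Fourier sum with kernel exp(sign * 2*pi*i*m*k/n); fine for small n."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def water_level_deconvolve(s, w, level=1e-3):
    """Estimate r from s = w * r by computing S * conj(W) / max(|W|^2, level * max|W|^2)."""
    n = len(s)
    S = dft(s)
    W = dft(list(w) + [0.0] * (n - len(w)))       # zero-pad the wavelet to the trace length
    power = [abs(Wm) ** 2 for Wm in W]
    floor = level * max(power)                    # the water level
    R = [Sm * Wm.conjugate() / max(pm, floor) for Sm, Wm, pm in zip(S, W, power)]
    return [value.real / n for value in dft(R, sign=+1)]

# Hypothetical noise-free example: build s by convolution, then recover r.
wavelet = [1.0, -0.5, 0.25]
reflectors = [0.0] * 64
reflectors[10], reflectors[25], reflectors[26] = 0.5, -0.3, 0.2
seismogram = [0.0] * 64
for k, rk in enumerate(reflectors):
    for j, wj in enumerate(wavelet):
        if k + j < 64:
            seismogram[k + j] += rk * wj
recovered = water_level_deconvolve(seismogram, wavelet)
```

This wavelet has no spectral zeros, so the water level is never reached and the reflector series is recovered to rounding error. With noisy data or a band-limited wavelet the level trades resolution for stability.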



[Plot: Klauder wavelet amplitude versus lag time.]

Fig. 3.3-30 The auto-correlation of a Vibroseis sweep signal is an impulsive Klauder wavelet.

Fig. 3.3-31 A Vibroseis record is a sum of sweep signals reflected from
various interfaces. Cross-correlation with a sweep signal produces
Klauder wavelets at the reflection times. (Conoco.)


3.3.7 Migration 

Given the “cleanest” possible seismic section, how good an
image of the subsurface is it? Ideally, the section produced by 
CMP stacking is a zero-offset section, because the traces have 
been converted to what would be recorded for a coincident 
source and receiver. The ray path down to a reflector and back 
up must be the same, so Snell’s law requires that this path 
be normally incident on the reflector. If the structure were 
composed of horizontal interfaces, the reflection paths would 
be vertical, and the time section could be converted to a depth 
section by using the velocities to scale the time axis (Fig. 3.3-32, 
left). In this case, a reflection’s arrival time indicates depth to a 
reflector directly below the source and receiver. 
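For flat layers, scaling the time axis amounts to accumulating velocity times one-way time, layer by layer. The sketch below assumes the interval velocities and the two-way times to the base of each layer are already known; the function name and the two-layer values (times in s, velocities in km/s) are hypothetical.

```python
def depth_at_time(t, layer_base_times, layer_velocities):
    """Depth reached by a vertical ray whose two-way travel time is t.

    layer_base_times[k] is the two-way time to the base of layer k,
    layer_velocities[k] the velocity within that layer.
    """
    depth, top = 0.0, 0.0
    for base, v in zip(layer_base_times, layer_velocities):
        if t <= base:
            return depth + v * (t - top) / 2.0   # halve: time is two-way
        depth += v * (base - top) / 2.0
        top = base
    # Below the last listed interface, continue with the last velocity.
    return depth + layer_velocities[-1] * (t - top) / 2.0

# Hypothetical model: 2 km/s down to 1.0 s, then 3 km/s down to 2.0 s.
base_times = [1.0, 2.0]
velocities = [2.0, 3.0]
depths = [depth_at_time(t, base_times, velocities) for t in (1.0, 1.5, 2.0)]
```

Here the reflections at 1.0 s and 2.0 s map to depths of 1.0 km and 2.5 km, so equal intervals of time correspond to unequal intervals of depth.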


For more complex structures, things get trickier. If interfaces
are not horizontal, although the ray paths are the same up and
down and intersect the interface at right angles, the path need
not be vertical (Fig. 3.3-32, center). Moreover, there are several
paths from a single source-receiver pair to a reflector. The relation
between the zero-offset time section and the structure is
thus more complicated.

To deal with these questions, we consider the wave field
u(x, z, t), the displacement as a function of position and time during
a seismic experiment. The traces are the data at the surface,
u(x, z = 0, t). The question of what the traces show about
the subsurface can be addressed via a theoretical exploding
reflectors experiment, in which seismic sources on the reflectors
explode at time zero (Fig. 3.3-32, right). Waves propagate
upward from the reflectors and are recorded at the surface. The
reflectors do not interact further with the waves, so multiple
reflections are not generated. The sources have strength proportional
to the reflection coefficients, so the amplitudes at
the surface are correct. Finally, to correct for the fact that the
actual reflections went both up and down, times on the recorded
traces are divided by two. The recorded data can thus
be thought of as resulting from the explosion of the reflectors.

The recorded data are directly related to the structure at
depth. At t = 0, the instant the sources explode, the wave field at
depth, u(x, z, 0), is exactly the geometry of the reflectors, and
thus the desired image of the subsurface. These waves propagate
upward to the surface z = 0, and are recorded as the
seismic section u(x, 0, t). Hence the reflectors can be found
from the section by removing the effects of propagation, using
an operation called migration.

Fig. 3.3-32 Three idealized seismic reflection experiments. Left: A zero-offset seismic section for a flat-layered medium. The only reflection points are
directly below the source and the receiver. Center: A zero-offset seismic section for a medium with a nonhorizontal interface. Although the upgoing and
downgoing ray paths are the same, the reflection points need not be directly below the source and the receiver. For a given reflector, several ray paths can
yield arrivals at a single receiver. Right: A conceptual model in which reflectors explode, giving a wave field with the geometry of the reflectors that
propagates to give the observed seismic section. Migration seeks to reverse this process and find the initial wave field from the seismic
section. (After Claerbout, 1985.)

Fig. 3.3-33 Effect of a point source or diffractor. Left: The point
diffractor acts as a source of spherical (circular in two dimensions) wave
fronts. Right: The resulting seismic section with time scaled by velocity
so that the vertical axis has the dimensions of distance. The diffraction
appears as a hyperbola. The point at the apex corresponds to the true
position of the diffractor. The other points are due to later arrivals from
the diffractor at nonvertical incidence, so their positions on the section
do not indicate a reflector directly below the receiver.

We first consider a constant-velocity medium in which a
point source at $(x_0, z_0)$ explodes at t = 0. The resulting displacement
is a circular wave front (Section 2.4.3) that expands
with time at a rate equal to the velocity (Fig. 3.3-33) and is
described using a delta function

$$u(x, z, t) = \delta\left((x - x_0)^2 + (z - z_0)^2 - (vt)^2\right), \tag{70}$$

whose downgoing half we ignore. The resulting seismic section
is the wave field at the surface, z = 0,

$$u(x, 0, t) = \delta\left((x - x_0)^2 + z_0^2 - (vt)^2\right). \tag{71}$$

This is a hyperbola with apex at $(x = x_0,\ t = z_0/v)$, showing
that the wave front arrives first directly above the source, and
arrives later at points farther away. Thus the arrival seen on
the seismic section is not equivalent to geologic structure with
depth. A way to visualize the relation between the source position
and the seismic section is to plot the time axis in units
of vt, giving a time scale equal to the propagation distance.
Thus an arrival time equals the distance along the true path 
from the source to the receiver. As illustrated, the depth of the 
source is shown correctly on the section only by the arrival time 
at a receiver directly above the source. For all other points on 
the surface, the arrival appears at a time corresponding to the 
travel time to that point, along a path that was not vertical. 
Hence, except above the source, the arrival on the section does 
not correspond to a source directly below the receiver, and the 
arrival time does not give the source depth directly. 
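The argument of the delta function in Eqn 71 vanishes at t = sqrt((x - x0)² + z0²)/v, which gives the arrival time at each receiver. The sketch below evaluates this for a hypothetical diffractor 500 m deep beneath x0 = 1000 m in a 2000 m/s medium; the function name and the values are illustrative only.

```python
import math

def diffraction_time(x, x0, z0, v):
    """Arrival time at surface position x from a point source at (x0, z0) exploding at t = 0."""
    return math.sqrt((x - x0) ** 2 + z0 ** 2) / v

x0, z0, v = 1000.0, 500.0, 2000.0           # hypothetical geometry (m, m, m/s)
receivers = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
times = [diffraction_time(x, x0, z0, v) for x in receivers]
scaled = [v * t for t in times]             # the "vt" axis: propagation distance in m
```

Only the receiver at x = 1000 m plots the arrival at the true depth of 500 m. At x = 2000 m the scaled time is about 1118 m, although no source lies at that depth below that receiver.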

The hyperbolic arrival on a seismic section due to a point
source at depth is called a diffraction hyperbola. It lets us
understand how complicated structures appear on seismic
sections, because by Huygens’ principle (Section 2.5.10) the
reflection from an interface can be found by treating the interface
as a set of point sources. The resulting reflection is found by
summing the wave fronts from these Huygens’ sources, which
are also called point diffractors, or point scatterers. Because
each source produces a diffraction hyperbola on the seismic
section, the section resulting from a set of point diffractors
is the sum of their diffraction hyperbolas. In considering this
sum, we use the results of a more sophisticated analysis showing
that the diffraction hyperbola’s amplitude is largest at the
apex and decays as the cosine of the angle off to the sides.
Because Eqn 71 describes only the travel times, it does not include
this term, whose effect is visible in Fig. 3.3-34.

Fig. 3.3-34 Diffraction hyperbola with true amplitudes.
(After Claerbout, 1985.)

To illustrate this approach, consider an interface dipping at
an angle $\beta$. This should give the same reflections as a line of
closely spaced point diffractors, so the seismic section will be
a sum of the resulting hyperbolas. As shown in Fig. 3.3-35,
the hyperbolas interfere constructively, causing an apparent
interface. Interestingly, this apparent interface does not pass
through the apex of each hyperbola, so it is displaced from
the real interface and appears to have a shallower dip angle, $\alpha$.
Because the scaled travel time to the true interface equals the
scaled arrival time on the trace, the real and apparent dips are
related by $\sin\beta = \tan\alpha$.
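The relation between the two dips follows from the zero-offset geometry. As a short worked version, assume for simplicity that the interface reaches the surface at x = 0 (an assumption made here only to fix the origin). The zero-offset ray from a receiver at x meets the interface at right angles, so its length d is the perpendicular distance from the receiver to the interface:

```latex
\begin{aligned}
d &= x \sin\beta
  && \text{(perpendicular distance to an interface of dip } \beta\text{)} \\
v t &= d = x \sin\beta
  && \text{(scaled arrival time, plotted directly below } x\text{)} \\
\tan\alpha &= \frac{d(vt)}{dx} = \sin\beta
  && \text{(slope of the apparent interface on the section)}
\end{aligned}
```

For example, a true dip of 30° gives tan α = 0.5, an apparent dip of about 26.6°. Because sin β cannot exceed 1, the apparent dip on such a section never exceeds 45°, however steep the true interface.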

Seismic sections from simple structures can appear quite different
from the actual structure. For example, consideration
of ray paths for a single reflector with a synclinal structure
shows that several arrivals from different points on the reflector
appear on a single zero-offset trace, each with a different
travel time. As a result, an apparent anticlinal, or “bowtie,”
structure appears (Fig. 3.3-36). Another common effect is that
the edges of sharp interfaces can give rise to long diffraction
“tails” (Fig. 3.3-37). This effect is analogous to diffraction at
the edges of a slit (Fig. 2.5-18).

The goal of migration is to undo the effects of diffraction
and hence convert the data to realistic images of the subsurface.
Migration can thus be thought of as an inverse scattering
or inverse diffraction problem. Because this requires removing
propagation effects, migration methods are derived using
forward models of the propagation process. The idea that the
section is the sum of diffractions suggests one approach known
as diffraction sum migration, or Kirchhoff migration. Because
point diffractors cause hyperbolas on the seismic section, the
amplitude of each point on the migrated section is found by
summing the unmigrated section with appropriate scaling along
hyperbolic trajectories (Fig. 3.3-38). This operation should
collapse all the signal in diffraction hyperbolas to points at
their apexes, and thereby reconstruct the reflectors as a set
of point diffractors. Thus diffraction artifacts like those in Figs
3.3-36 and 3.3-37 should be removed, and apparent interfaces with
shallow dips should be converted to interfaces with the steeper
true dips. Figure 3.3-39 shows the improvement to a seismic
section from Kirchhoff migration. The resulting migrated time
section can be converted to a depth section using a velocity-depth
function.

Fig. 3.3-35 A dipping layer can be modeled as a line of point diffractors.
On a time section, interference between the diffraction hyperbolas
produces an apparent reflector, as shown schematically (top) and with true
amplitudes (bottom). Because the scaled travel time to the true interface
($vt_r$) equals the scaled arrival time on the trace ($vt_a$), the apparent dip $\alpha$ is
shallower than the true dip $\beta$. (After Claerbout, 1985.)

Fig. 3.3-36 Top: Illustration of several ray paths for reflections at zero
offset from a reflector with a synclinal structure. Bottom: Because these
arrivals, each from a different point on the reflector, have different travel
times, the time section shows an apparent anticlinal, or “bowtie,”
structure. (Kearey and Brooks, 1984.)

Fig. 3.3-37 The ends of truncated interfaces
(left) act as diffractors, so a time section
(right) shows spurious down-dip extensions
of the interfaces. (Claerbout, 1976.)
(http://sepwww.stanford.edu/sep/prof/)

Fig. 3.3-38 Diffraction sum migration reverses the effects of diffraction
by summing the time section along hyperbolic trajectories, thus collapsing
hyperbolas to their apexes. (Schneider, 1971. Reproduced by permission
of the Society of Exploration Geophysicists.)

Fig. 3.3-39 Time section before (top) and after (bottom) migration.
Elimination of the diffractions produces a better image of structures at
depth, such as those at about 1.8 s, where bowties and diffraction “tails”
have been suppressed. (Prakla-Seismos)

The appearance of a migrated section depends on the
assumed velocity. Using a too-slow velocity reduces the length
of hyperbolas’ “tails,” but does not fully collapse them, and so
is termed undermigration. Similarly, a too-high velocity overmigrates
the data, converting upward-pointing hyperbolas
into downward-pointing ones. As a result, correct imaging of
dipping structures depends on an accurate velocity model.
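The diffraction sum and its sensitivity to velocity can be illustrated on a synthetic section containing a single diffraction hyperbola. The sketch below is a bare-bones version under stated simplifications: constant velocity, exploding-reflector (one-way) times, nearest-sample lookup, and no amplitude scaling along the hyperbola. The function names, grid, and velocities are hypothetical.

```python
import math

def hyperbola_sample(apex_sample, offset, v, dt):
    """Time sample at horizontal distance `offset` on the hyperbola with the given apex sample."""
    return int(round(math.sqrt((apex_sample * dt) ** 2 + (offset / v) ** 2) / dt))

def diffraction_sum_migrate(section, dx, dt, v):
    """For every (trace, apex time) sum the section along the corresponding hyperbola."""
    nx, nt = len(section), len(section[0])
    image = [[0.0] * nt for _ in range(nx)]
    for jx in range(nx):                 # position of the output (image) trace
        for j0 in range(nt):             # apex time sample of the trial hyperbola
            total = 0.0
            for ix in range(nx):         # sum across all input traces
                it = hyperbola_sample(j0, (ix - jx) * dx, v, dt)
                if it < nt:
                    total += section[ix][it]
            image[jx][j0] = total
    return image

def peak(image):
    """Return (value, trace index, time sample) of the largest image amplitude."""
    return max((val, jx, j0) for jx, row in enumerate(image) for j0, val in enumerate(row))

# Synthetic section: one point diffractor below trace 20 with apex at sample 50.
nx, nt, dx, dt, v_true = 41, 120, 25.0, 0.004, 2000.0
section = [[0.0] * nt for _ in range(nx)]
for ix in range(nx):
    section[ix][hyperbola_sample(50, (ix - 20) * dx, v_true, dt)] = 1.0

image_correct = diffraction_sum_migrate(section, dx, dt, v_true)
image_slow = diffraction_sum_migrate(section, dx, dt, 1500.0)   # too-slow velocity
```

With the correct velocity every trace contributes at the apex, so all of the energy collapses to the diffractor position. With the too-slow velocity the summation hyperbolas no longer match the data, and the peak is weaker and smeared, as expected for undermigration.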

Other migration methods, called wave equation migration,
use a double Fourier transform to map a wave field, u(x, z, t),
from the horizontal distance and time (x, t) domain to the
horizontal wavenumber and angular frequency $(k_x, \omega)$ domain.
The transform is

$$U(k_x, z, \omega) = \iint u(x, z, t) \exp[i(-\omega t + k_x x)]\, dx\, dt, \tag{72}$$

with inverse transform

$$u(x, z, t) = \frac{1}{4\pi^2} \iint U(k_x, z, \omega) \exp[i(\omega t - k_x x)]\, dk_x\, d\omega. \tag{73}$$

If we consider only P waves, the wave field u(x, z, t) satisfies the
wave equation in two dimensions:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial z^2} = \frac{1}{v^2} \frac{\partial^2 u}{\partial t^2}. \tag{74}$$


The corresponding condition on the transform $U(k_x, z, \omega)$ is
found by substituting the inverse transform for u, taking the
derivatives, and canceling, yielding

$$\frac{\partial^2 U}{\partial z^2} = \left(k_x^2 - \frac{\omega^2}{v^2}\right) U. \tag{75}$$

Because the components of the wavenumber vector are related
by

$$|\mathbf{k}|^2 = k_x^2 + k_z^2 = \omega^2 / v^2, \tag{76}$$

the transform satisfies

$$\frac{\partial^2 U}{\partial z^2} = -k_z^2 U. \tag{77}$$

If the velocity is constant with depth, $k_z$ is independent of z, so
integrating Eqn 77 yields

$$U(k_x, z, \omega) = U(k_x, 0, \omega) \exp[\pm i k_z z]. \tag{78}$$
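The substitution that leads from the wave equation to the condition on the transform can be written out term by term. Applying each derivative in Eqn 74 to one Fourier component $U(k_x, z, \omega)\exp[i(\omega t - k_x x)]$ of the inverse transform (Eqn 73) gives

```latex
\begin{aligned}
\frac{\partial^2}{\partial x^2}\Big(U e^{i(\omega t - k_x x)}\Big) &= -k_x^2\, U e^{i(\omega t - k_x x)}, \\
\frac{\partial^2}{\partial z^2}\Big(U e^{i(\omega t - k_x x)}\Big) &= \frac{\partial^2 U}{\partial z^2}\, e^{i(\omega t - k_x x)}, \\
\frac{1}{v^2}\frac{\partial^2}{\partial t^2}\Big(U e^{i(\omega t - k_x x)}\Big) &= -\frac{\omega^2}{v^2}\, U e^{i(\omega t - k_x x)}, \\
\Rightarrow\quad -k_x^2 U + \frac{\partial^2 U}{\partial z^2} &= -\frac{\omega^2}{v^2} U
\quad\Longrightarrow\quad
\frac{\partial^2 U}{\partial z^2} = \Big(k_x^2 - \frac{\omega^2}{v^2}\Big) U = -k_z^2 U .
\end{aligned}
```

The last equality uses $k_x^2 + k_z^2 = \omega^2/v^2$. Differentiating $\exp[\pm i k_z z]$ twice with respect to z multiplies it by $-k_z^2$, which confirms that both signs of the exponential solve the equation when $k_z$ does not depend on z.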

This equation relates the wave field at the surface and at any
depth. The operation of converting one to the other is called
downward or upward continuation of the wave field. The sign
of the exponential distinguishes upcoming from downgoing
waves. Because z increases downward, upcoming waves occur
when $k_z$ and $\omega$ have the same sign. To ensure this, we define $k_z$
as a function of $\omega$,

$$k_z(\omega, k_x) = \operatorname{sgn}(\omega) \sqrt{\omega^2 / v^2 - k_x^2}, \tag{79}$$

where the function $\operatorname{sgn}(\omega)$ is 1 when $\omega$ is positive, and is -1
when $\omega$ is negative. Using this definition, the inverse transform
Eqn 73 becomes

$$u(x, z, t) = \frac{1}{4\pi^2} \iint U(k_x, 0, \omega) \exp[i(\omega t - k_x x + k_z(\omega, k_x) z)]\, dk_x\, d\omega. \tag{80}$$
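Eqn 80 continues the surface record downward. Evaluating it at t = 0, the instant at which the wave field in the exploding reflector model coincides with the reflectors, gives a simple constant-velocity migration. The sketch below does this with naive discrete Fourier sums so that the sign conventions of Eqns 72, 79, and 80 stay visible; a practical code would use fast transforms. The function names, the small grid, the Gaussian pulse, and the dropping of evanescent components are choices made here for illustration.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier sum with kernel exp(sign * 2*pi*i*m*k/n); fine for small n."""
    n = len(values)
    return [sum(values[k] * cmath.exp(sign * 2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def angular_frequencies(n, spacing):
    """2*pi times the signed DFT frequencies (same ordering as numpy.fft.fftfreq)."""
    return [2.0 * math.pi * (m if m < (n + 1) // 2 else m - n) / (n * spacing)
            for m in range(n)]

def phase_shift_migrate(section, dx, dt, v, depths):
    """section[ix][it] is the zero-offset record; returns image[ix][iz] = u(x, z, t=0)."""
    nx, nt = len(section), len(section[0])
    omega = angular_frequencies(nt, dt)
    kx = angular_frequencies(nx, dx)
    # Signs as in Eqn 72: exp(-i*omega*t) over time, exp(+i*kx*x) over distance.
    spectra = [dft(trace, -1) for trace in section]
    U = [dft([spectra[ix][m] for ix in range(nx)], +1) for m in range(nt)]
    image = [[0.0] * len(depths) for _ in range(nx)]
    for iz, z in enumerate(depths):
        shifted = []
        for l in range(nx):
            total = 0j
            for m in range(nt):
                radicand = (omega[m] / v) ** 2 - kx[l] ** 2
                if radicand > 0.0:  # evanescent components are simply dropped
                    kz = math.copysign(math.sqrt(radicand), omega[m])  # Eqn 79
                    total += U[m][l] * cmath.exp(1j * kz * z)
            shifted.append(total)
        column = dft(shifted, -1)  # exp(-i*kx*x), as in Eqn 80 with t = 0
        for ix in range(nx):
            image[ix][iz] = column[ix].real / (nx * nt)
    return image

# Hypothetical test section: one point diffractor at (x0, z0), one-way times, Gaussian pulse.
nx, nt, dx, dt, v = 32, 64, 20.0, 0.008, 2000.0
x0, z0 = 16 * dx, 320.0
section = [[math.exp(-((it * dt - math.hypot(ix * dx - x0, z0) / v) / 0.012) ** 2)
            for it in range(nt)] for ix in range(nx)]
depths = [k * v * dt for k in range(40)]
image = phase_shift_migrate(section, dx, dt, v, depths)
peak_ix, peak_iz = max(((ix, iz) for ix in range(nx) for iz in range(len(depths))),
                       key=lambda p: abs(image[p[0]][p[1]]))
```

The largest image amplitude falls at, or within a few samples of, trace 16 and depth sample 20, the position of the diffractor. The sums over frequency and wavenumber are repeated for every depth, which is the cost that motivates changing the integration variable from frequency to vertical wavenumber.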

The integral in Eqn 80 relates the Fourier transform of the seismic
section recorded on the surface, z = 0, $U(k_x, 0, \omega)$, and the
upcoming wave field at depth at earlier times. By the exploding
reflector model, the image of the subsurface is the wave field at
t = 0, when the reflectors have just exploded. Thus the image
can be found by setting t = 0:

$$u(x, z, 0) = \frac{1}{4\pi^2} \iint U(k_x, 0, \omega) \exp[i(-k_x x + k_z(\omega, k_x) z)]\, dk_x\, d\omega. \tag{81}$$

Although this integral migrates the transform of the seismic
section into the desired image, the integral over $\omega$ and $k_x$ has to
be done separately to find the image at every depth z. A way to
get around this is to replace the $\omega$ integration with one over $k_z$,
by expressing $\omega$ as a function of $k_x$ and $k_z$,

$$\omega(k_x, k_z) = \operatorname{sgn}(k_z)\, v \sqrt{k_x^2 + k_z^2}, \tag{82}$$

and changing variables using

$$\frac{d\omega}{dk_z} = \frac{v\,|k_z|}{\sqrt{k_x^2 + k_z^2}}. \tag{83}$$

This change converts Eqn 81 into an inverse Fourier transform
from the wavenumber $(k_x, k_z)$ domain to the space (x, z) domain

$$u(x, z, 0) = \frac{1}{4\pi^2} \iint U(k_x, 0, \omega(k_x, k_z)) \exp[i(-k_x x + k_z z)]\, \frac{v\,|k_z|}{\sqrt{k_x^2 + k_z^2}}\, dk_x\, dk_z, \tag{84}$$

so inverting the double transform once gives the image for all
x and z.

Fig. 3.3-40 Schematic illustration of the relation between processing
operations for reflection data. Deconvolution applied along the time
axis increases temporal resolution. CMP stacking along the offset axis
collapses the data to the midpoint-time plane (compare to Fig. 3.3-18),
yielding a seismic section and enhancing reflections. Migration applied
in this plane improves lateral resolution. (After Yilmaz, 1987.)

The application of migration methods to data involves various
complexities. The time axis in the section can be scaled
to account for the variation in velocity with depth. Often in
complex geology horizontal variations in velocity are important,
so migration can be conducted with numerical methods
that can propagate waves through laterally varying media.

3.3.8 Data processing sequence 

The various processing operations for seismic reflection data
can be combined in different ways. To illustrate this, we summarize
a common sequence for some of the possible operations.
For simplicity, the discussion is in terms of one horizontal
dimension, but the approach applies to two dimensions.

Preprocessing consists of initial steps. Because data from different
receivers are recorded simultaneously, they are reorganized
(demultiplexed) to produce a trace for each receiver. The
traces are then edited to eliminate effects such as noisy traces
or recording errors. Static time shifts are applied when needed
(Section 3.3.5). The amplitudes are then adjusted using a gain
recovery function that corrects for the fact that the later arrivals
have lower amplitudes because of reflections, transmissions,
geometric spreading, and attenuation (Section 3.7). The data
are combined into common source gathers, and can then be
viewed as a volume defined by the time, offset, and midpoint
axes (Fig. 3.3-40). They can then be filtered using methods
(Section 3.3.5) including muting of undesired arrivals, bandpass
filtering to enhance or suppress certain frequencies, and
velocity or slant stack filtering to suppress certain arrivals.
Deconvolution (Section 3.3.6), which improves the time resolution
of the data, can be viewed as acting along the time axis in
Fig. 3.3-40.

At this point, common midpoint stacking and velocity
analysis are conducted for the gathers. These operations
(Section 3.3.4) combine data for each midpoint to produce a seismic
section that approximates what would be recorded at zero offset.
Geometrically, this acts along the offset axis to collapse all
the data to the midpoint-time plane. Migration (Section 3.3.7)
in the midpoint-time plane seeks to eliminate artifacts due to
diffractions and convert the seismic section to an image of the
subsurface. The migrated section can then be converted, using
assumptions about velocities, to a depth section. The depth
section is then interpreted together with geological data and
other types of geophysical data, in some cases from drill holes,
to understand the subsurface geology.

This discussion of the processing sequence brings out the
point that although it is natural to treat a seismic section as an
accurate image of the subsurface, it is actually a display of a
seismic wave field showing the energy arriving as a function of
two-way travel time. Thus the quantity shown, vertical displacement
or pressure, need not correspond to any geological
reflector of interest. Large arrivals can result from interference
between reflections from small impedance contrasts. Moreover,
because a seismic section has been produced by mathematical
operations, rather than the physical experiment it
simulates, noise in the data and errors in the processing can
produce spurious artifacts. For example, the conversion of time
to depth is only as accurate as the velocities found by stacking
or otherwise, perhaps from measurements in a drill hole.

As we have seen in discussing migration, seismic sections 
are most likely to deviate from the desired images when the 
medium has significant lateral variations. For example, a 
medium with random heterogeneities can yield spurious short 
layered segments, because the reflected energy depends on the 
vertical changes in impedance. Thus long-wavelength vertical 
variations in impedance are suppressed, whereas both short- 
and long-wavelength horizontal variations are preserved, and 
so can yield a structure with apparent horizontal layering. This 
effect can be viewed as a velocity filter (Section 3.3.5) that 
reduces horizontal resolution for structures with steep dips. 
Similar effects, which are prone to occur at large offsets, may 
contribute to the horizontally discontinuous layering observed 
in deep crustal reflection data (Section 3.2.4). Hence, as we 
will see in various contexts throughout our discussions (e.g., 
Section 7.3), studying three-dimensional velocity structure is 
an interesting and challenging enterprise. 



3.4 Seismic waves in a spherical earth

In the previous sections, we developed the theory to use the
travel times of seismic waves to study the velocity structure of a
medium composed of flat layers. This analysis is useful when
the ray paths between the source and the receiver are short
enough that the earth’s curvature can be neglected. Because this
is the case for distances less than a few hundred kilometers,
such analysis is used to study structure in the crust and the
uppermost mantle. In this section we develop the corresponding
theory for a spherical earth, which can be used for greater
distances and thus greater depths. Application of these results,
discussed in the next section, is our primary tool for studying
the structure of the deep earth.

Fig. 3.4-1 Geometry of Snell’s law for a spherical earth.

3.4.1 Ray paths and travel times

By analogy to the way we previously represented the earth using
uniform flat layers, we now treat it as a series of concentric
spherical shells of uniform-velocity material. The ray paths and
travel times for the spherical geometry are described by expressions
similar to those for flat layers (Section 3.3.1). Consider
the portion of a seismic ray’s path connecting points at radial
distances $r_1$ and $r_2$ from the earth’s center (Fig. 3.4-1). If $v_1$ and
$v_2$ are the velocities above and below $r_1$, and $i_1$, $i_1'$, and $i_2$ are the
angles shown, then by Snell’s law

$$\frac{r_1 \sin i_1}{v_1} = \frac{r_1 \sin i_1'}{v_2}. \tag{1}$$

However, $r_1 \sin i_1' = r_2 \sin i_2$ because both equal the length ON,
so we rewrite Eqn 1 as

$$\frac{r_1 \sin i_1}{v_1} = \frac{r_2 \sin i_2}{v_2}. \tag{2}$$

Thus we define the ray parameter p for a spherical earth as

$$p = \frac{r \sin i}{v}, \tag{3}$$

where r is the radial distance from the center of the earth, v is
the velocity at that point, and i is the incidence angle between
the ray path and the radius vector. As the shells are made
ever thinner, the velocity becomes a continuous
function of radius, v(r). Equation 3 is thus Snell’s law for a
spherical earth, which describes the ray path. As for the flat
earth, the ray parameter is constant along the ray path, and
thus identifies a particular ray.

It may seem strange that different forms of the ray parameter
and Snell’s law occur for a sphere. At any given depth, the flat
layer formulation, that $p = \sin i / v$ is constant, is valid. The
factor r corrects for the change along the path of the orientation
of the normal to the interface, which is the radius. If r changes
so slowly along the path that its variation can be ignored,
we obtain the flat case. Thus the flat layer version is used for
near-surface refraction and reflection studies.

Fig. 3.4-2 Geometry of a ray path in a spherical earth with velocity
increasing with depth. The angle of incidence, i, is 90° at the bottoming
depth $r_p$.

The condition of constant ray parameter relates the ray path
to the velocity structure. For a source at a radius $r_0$ (the earth’s
radius for a surface source) where the velocity is $v_0$,

$$p = r_0 \sin i_0 / v_0. \tag{4}$$

Rays leaving the source at different angles thus have different
ray parameters. As the ray travels downward, r decreases, and
in general v increases, so sin i and thus i increase, because p
is constant. The ray eventually “bottoms” and turns upward
when i = 90° (Fig. 3.4-2). At this bottoming depth, $r = r_p$ and

$$p = r_p / v_p. \tag{5}$$

From this point the ray returns to the surface. Different rays,
with different p, thus bottom at different depths.
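Because sin i = 1 where the ray turns, the bottoming radius is the radius at which r/v(r) equals the ray parameter. The sketch below finds it by bisection, assuming that r/v(r) increases steadily with r, as it does when velocity increases with depth. The linear velocity model (km/s, with r in km) and the function names are hypothetical and chosen only to make the example concrete.

```python
import math

def ray_parameter(r0, v0, takeoff_deg):
    """p = r0 * sin(i0) / v0 for a ray leaving radius r0 at angle i0 from the radius vector."""
    return r0 * math.sin(math.radians(takeoff_deg)) / v0

def bottoming_radius(p, velocity, r0, tol=1e-6):
    """Solve r / velocity(r) = p by bisection, assuming r / velocity(r) increases with r."""
    lo, hi = 0.0, r0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / velocity(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def velocity(r):
    """Hypothetical model: 8 km/s at the surface, increasing linearly with depth."""
    return 8.0 + 0.002 * (6371.0 - r)

p = ray_parameter(6371.0, 8.0, 30.0)          # about 398 s/rad
r_bottom = bottoming_radius(p, velocity, 6371.0)
```

For this model the ray leaving at 30° bottoms near a radius of 4600 km. A ray leaving at a smaller angle has a smaller p and bottoms deeper, and a ray leaving horizontally (90°) bottoms at the source radius.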

Fig. 3.4-3 Two rays with infinitesimally different ray parameters
illustrating the relationship $p = dT/d\Delta$.

Fig. 3.4-4 Variables defining ds, a portion of the ray path subtending an
angle $d\theta$.

Consider two rays with ray parameters p and p + dp, that
arrive at nearby points on the earth’s surface (Fig. 3.4-3). The
ray with ray parameter p takes a travel time T to travel a distance
$\Delta$, measured by the angle subtended at the earth’s center,
whereas the ray with p + dp takes T + dT to travel $\Delta + d\Delta$. In the
limit, as the distance between the two points goes to zero,

$$p = \frac{dT}{d\Delta}.$$

Thus, as for the flat layer case (Section 3.3.1) the ray parameter
is the reciprocal of the apparent velocity along the
surface, $c_x$, here scaled by the earth’s radius because $\Delta$ is an angle:

$$p = \frac{dT}{d\Delta} = \frac{r_0}{c_x}.$$

Hence the ray parameter can be measured from the difference
in arrival times at nearby stations. Conversely, the slope of a
travel time curve $T(\Delta)$ is the ray parameter of the ray emerging
at a distance $\Delta$.
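In practice the derivative is approximated by a finite difference between two nearby stations. The sketch below shows the arithmetic, including the conversion from seconds per degree to seconds per radian that is needed before dividing the earth's radius by p; the station distances and arrival times are invented for illustration.

```python
import math

def ray_parameter_from_arrivals(delta1_deg, t1, delta2_deg, t2):
    """Finite-difference estimate of p = dT/dDelta, returned as (s/degree, s/radian)."""
    p_deg = (t2 - t1) / (delta2_deg - delta1_deg)
    return p_deg, p_deg * 180.0 / math.pi

# Hypothetical arrivals: 456.0 s at 40 degrees and 464.3 s at 41 degrees.
p_deg, p_rad = ray_parameter_from_arrivals(40.0, 456.0, 41.0, 464.3)

# Apparent velocity along the surface, using the earth's radius in km.
apparent_velocity = 6371.0 / p_rad
```

For these made-up values p is 8.3 s per degree, or about 476 s per radian, which corresponds to an apparent velocity along the surface of roughly 13.4 km/s.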

Because the geometry is spherical, it is natural to describe the
ray path in polar coordinates. Consider (Fig. 3.4-4) the point P
on the ray path with polar coordinates $(r, \theta)$. A small portion
of the ray path, ds, subtends an angle $d\theta$ at the center of the earth,
so

$$(ds)^2 = (dr)^2 + r^2 (d\theta)^2 \quad \text{and} \quad \sin i = r \frac{d\theta}{ds}. \tag{9}$$

Substitution in Snell’s law (Eqn 3) gives

$$p = \frac{r \sin i}{v} = \frac{r^2}{v} \frac{d\theta}{ds}. \tag{10}$$

We thus use Eqns 9 and 10 to form and equate two expressions
for $(ds/d\theta)^2$,

$$\frac{r^4}{p^2 v^2} = \left(\frac{dr}{d\theta}\right)^2 + r^2, \tag{11}$$



material. As written, the integrals are from the surface to the 
bottoming depth, with the factor of 2 accounting for the 
return trip to the surface. If the source is not at the surface, 
the limits of integration are changed appropriately. 

For the flat geometry, we found it useful to describe the 
travel time curve in terms of its slope, the ray parameter, p , 
and the time axis intercept of its tangent, t (Section 3.3.2). To 
do the same for the spherical geometry, we write the travel time 
curve as 

T(p)=pA(p) + z(p). (17) 

We then evaluate the function 


and manipulate them to obtain 


z{p) = T{p)~-pA{p) 


( 18 ) 


r{? 2 -pT 2 

where f is defined by f - r/v. Integrating this expression from 
the source depth, which we assume to be the surface r 0 , to the 
deepest point on the ray r py and doubling to account for the 
upward path, gives 


A {p) = \dd = 2p 


r(C 2 - P 2 ) y: 


This integral gives the angular distance A traveled by the ray 
with ray parameter p in an earth with a velocity distribution 
v{r). 

A similar integral expression for the travel time of this ray 
comes from combining Eqns 9 and 10, 


so that a portion of the ray path is 


ds = ±- --- 

v(C--p 2 f‘ 


Thus the travel time, defined by the integral of the slowness 
along the ray path, is given by 



These integral expressions for the distance A (p) and travel 
time T(p) of a ray in a spherical geometry are analogous to 
those for x(p) (Eqn 3.3.25) and T(p) (Eqn 3.3.26) in a layered 


using the integral expressions (Eqns 13 and 16), and find that 



(19) 


This formulation can be used to invert travel time curves for 
velocity structure. 
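
The integrals for Δ(p), T(p), and τ(p) (Eqns 13, 16, and 19) can be evaluated numerically for any smooth v(r). The sketch below is one possible approach, not the authors’ method: it assumes ζ = r/v increases with r (no low-velocity zone), finds the bottoming radius by bisection, and uses the substitution r = r_p + (r_0 - r_p)s² with a midpoint rule so that the integrable singularity at the turning point is removed. All function names and the homogeneous test model are illustrative assumptions.

```python
import numpy as np

def turning_radius(p, v_func, r0, r_min=1.0):
    """Bisection for the radius r_p where r / v(r) = p.

    Assumes r / v(r) increases with r between r_min and r0."""
    lo, hi = r_min, r0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / v_func(mid) < p:
            lo = mid
        else:
            hi = mid
    return hi  # upper bracket, so r / v(r) >= p for all r >= r_p

def distance_time_tau(p, v_func, r0, n=4000):
    """Return (Delta in radians, T, tau) for a surface source and ray parameter p."""
    rp = turning_radius(p, v_func, r0)
    s = (np.arange(n) + 0.5) / n           # midpoints of n cells on (0, 1)
    r = rp + (r0 - rp) * s**2              # substitution that tames the 1/sqrt singularity
    dr = 2.0 * (r0 - rp) * s / n           # dr/ds times the cell width
    zeta = r / v_func(r)
    root = np.sqrt(zeta**2 - p**2)
    delta = 2.0 * p * np.sum(dr / (r * root))        # Eqn 13
    T = 2.0 * np.sum(zeta**2 * dr / (r * root))      # Eqn 16
    return delta, T, T - p * delta                   # tau from Eqn 18

# Check against a homogeneous sphere, where rays are straight chords.
r0, v0 = 6371.0, 8.0
v_const = lambda r: v0 + 0.0 * np.asarray(r)
p = 500.0                                  # s/rad, so the ray bottoms at r_p = p * v0 = 4000 km
delta, T, tau = distance_time_tau(p, v_const, r0)
delta_exact = 2.0 * np.arccos(p * v0 / r0)
T_exact = 2.0 * np.sqrt(r0**2 - (p * v0)**2) / v0
```

For the homogeneous case the numerical values can be compared with the chord geometry, which gives a simple check before applying the routine to a depth-dependent v(r).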


3.4.2 Velocity distributions 

Different distributions of velocity with depth produce
characteristic travel time curves. Figure 3.4-5 shows the
usual situation in which velocity increases slowly with depth.
Given two rays, the one with a smaller angle of incidence at the
source has a smaller p, thus bottoms deeper at a point with
smaller r_p and larger v_p, and eventually emerges further from
the source. Thus the ray parameter decreases, and travel time
increases, monotonically with distance, Δ. The travel time
curve, T(Δ), is concave downward because its slope, p(Δ),
decreases with distance (dp/dΔ = d²T/dΔ² < 0). The
intercept-slowness curve, τ(p), is smooth. To show these
relations in different ways, the plots in Fig. 3.4-5 are aligned
so that the distance axis is common to the ray path, T(Δ), and
p(Δ) plots, the depth axis is the same for the ray path and
velocity-depth plots, and the time axis is the same for the T(Δ)
and τ(p) plots.

Fig. 3.4-5 Ray paths, T(Δ), p(Δ), and τ(p) relationships for velocity
increasing slowly with depth.

A more complicated situation occurs when velocity increases
rapidly with depth (Fig. 3.4-6). Rays that bottom either above
or below the region of high velocity gradient behave as in
Fig. 3.4-5, so the corresponding portions of the travel time and
ray parameter curves show T increasing with Δ, and p decreasing
with Δ. By contrast, rays that bottom in the region of high
velocity gradient are bent upward more and emerge at smaller
values of Δ than would otherwise be the case. As a result, three
rays with different ray parameters emerge at the same distance
Δ. Thus the p(Δ) and T(Δ) curves have three distinct branches.
On the two normal forward branches dp/dΔ < 0. However, on
the back branch, Δ decreases with decreasing p, so dp/dΔ > 0.
Thus rays with smaller incidence angles arrive closer to the
source, giving a characteristic triplication in the travel time
curve and a reversal in the p(Δ) curve. We will see in the next
section that triplications are observed in the travel time curve
for waves in the mantle, due to velocity increases that are
thought to result from mineral phase transitions.

A triplication is similar to the travel time curves for the
direct, reflected, and head waves for a layer over a halfspace
(Fig. 3.2-2). The back branch of the triplication is analogous to
the reflection, and the two forward branches are analogous to
the direct and head waves. As the velocity increase becomes
sharper and more like the sharp jump between a layer and
halfspace, the back branch extends further in either direction,
so the triplication looks increasingly like the travel times for a
layer over a halfspace.

Fig. 3.4-6 A triplication occurs if velocity increases rapidly, because at
some distances three rays arrive. The triplication appears as the three
branches in the T(Δ) and p(Δ) curves. The cusps on the travel time curve
where the branches meet correspond to reversals in the p(Δ) plot.

As we discussed in Section 2.8.4, geometric ray theory gives
information about amplitudes as well as travel times. Because
the rays plotted left the source at uniform increments of angle,
the amplitude expected at some distance depends on geometric
spreading, or the density of rays arriving. We expect high
amplitudes where rays are concentrated, and low amplitudes
where rays are sparse. Mathematically, the concentration of
rays is proportional to di/dΔ, the range of incidence angles
for the rays that arrive in a given distance. To find this, we
differentiate the definition of the ray parameter (Eqn 7),

$$ \frac{d^2 T}{d\Delta^2} = \frac{dp}{d\Delta} = \frac{d(r \sin i / v)}{d\Delta} = \frac{r}{v} \cos i \frac{di}{d\Delta}. \qquad (20) $$

Thus the amplitude is proportional to the second derivative
of the travel time curve, or the derivative of the p(Δ) curve.
For a triplication, the back branch meets the two forward
branches at two points on the travel time and p(Δ) curves. Here
dp/dΔ = ∞, so large amplitudes are expected. This situation is
called a caustic.
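
The branch structure described above can be picked out of a computed Δ(p) table by looking at the sign of the slope between neighboring rays. The following is a minimal sketch with made-up sample values; the function name and the synthetic numbers are illustrative assumptions, and real ray-tracing output would need denser sampling near the cusps.

```python
import numpy as np

def classify_branches(p, delta):
    """Label each interval between successive rays as a forward or back branch.

    Forward branches have dp/dDelta < 0 and back branches have dp/dDelta > 0;
    the sign of dDelta/dp is the same, so it is used here."""
    p = np.asarray(p, dtype=float)
    delta = np.asarray(delta, dtype=float)
    slope = np.diff(delta) / np.diff(p)
    labels = np.where(slope < 0, "forward", "back")
    # Cusps (caustics) are the rays at which the branch type changes.
    cusps = np.where(labels[1:] != labels[:-1])[0] + 1
    return labels, cusps

# Synthetic triplication: rays listed in order of decreasing ray parameter.
p = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0]
delta = [10.0, 20.0, 30.0, 25.0, 20.0, 35.0, 50.0]
labels, cusps = classify_branches(p, delta)
```

In this synthetic example the distance first increases, then retreats along the back branch, then increases again, so any distance between 20 and 30 is reached by three rays.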

A third important case is a low-velocity zone, where velocity
decreases with depth and then increases (Fig. 3.4-7). Rays
entering the low-velocity zone bend down, rather than up,
so no rays bottom there. To see this, note that for a ray to
bottom, it must turn upward (to a larger angle of incidence)
as it goes deeper (to smaller values of r), so that di/dr < 0.
Conversely, if di/dr > 0, the ray turns downward and cannot
bottom. These conditions can be written in terms of the
velocity-depth function by differentiating both sides of

$$ \sin i = \frac{pv}{r}, \qquad (21) $$

yielding

$$ \cos i \frac{di}{dr} = p \left( \frac{1}{r} \frac{dv}{dr} - \frac{v}{r^2} \right) = \sin i \left( \frac{1}{v} \frac{dv}{dr} - \frac{1}{r} \right), \qquad (22) $$

and thus

$$ \frac{di}{dr} = \tan i \left( \frac{1}{v} \frac{dv}{dr} - \frac{1}{r} \right). \qquad (23) $$

The condition that no rays bottom in a depth region where
di/dr is positive implies that the velocity decreases with depth
fast enough that

$$ \frac{dv}{dr} > \frac{v}{r}. \qquad (24) $$

Fig. 3.4-7 A low-velocity zone gives rise to a shadow zone, a distance
range where no direct geometric arrivals appear, and hence discontinuous
T(Δ), p(Δ), and τ(p) curves.
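
The condition in Eqn 24 is easy to test on a tabulated velocity profile. The sketch below assumes v is sampled at increasing radii r and uses a finite-difference derivative; the function name and the two test profiles are illustrative assumptions rather than earth models.

```python
import numpy as np

def rays_cannot_bottom(r, v):
    """Return a boolean mask that is True where dv/dr > v/r (Eqn 24).

    r : radii in increasing order, v : velocity at those radii."""
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    dv_dr = np.gradient(v, r)          # finite-difference estimate of dv/dr
    return dv_dr > v / r

r = np.linspace(6000.0, 6371.0, 50)

# Constant velocity: dv/dr = 0 < v/r, so rays can bottom at every radius.
mask_const = rays_cannot_bottom(r, np.full_like(r, 8.0))

# Velocity proportional to r**2: dv/dr = 2v/r > v/r, so no ray can bottom.
mask_lvz = rays_cannot_bottom(r, 8.0 * (r / 6000.0) ** 2)
```

Equivalently, the mask is True wherever ζ = r/v fails to decrease with depth, which is the condition under which the distance integral has no turning point in that depth range.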


This situation causes a shadow zone, a region of the earth’s
surface where no rays arrive. Just below the low-velocity zone,
rays reach a given Δ by two paths, giving two values of p
and travel time for a given distance. The back branch, with
dp/dΔ > 0, corresponds to the rays that would have bottomed at
the depth of the low-velocity zone, had the velocity there been
high enough. The forward branch, which continues to greater
distances, corresponds to the rays that bottom deeper. The
concentration of rays just past the shadow zone corresponds to
the point where the two branches meet. Here dp/dΔ = ∞, so
large amplitudes occur. We will see that this situation occurs as
a result of the drop in velocity across the core-mantle boundary,
which gives rise to a shadow zone.

3.4.3 Travel time curve inversion

To infer the distribution of velocity with depth, travel time
curves are compiled from seismograms recorded at different
source-receiver distances. The inverse problem of deriving
velocity structure from the T(Δ) curves can be done in various
ways. One is to use a computer program, based on Snell’s law,
to trace rays through different velocity structures and compute
the corresponding travel time curves. Figures 3.4-5 to 3.4-7 were
derived this way. This approach solves the inverse problem
by solving the forward problem repeatedly until a satisfactory
solution is found. An alternative is to solve the inverse problem
directly by deriving v(r) from T(Δ).

Various methods have been used to solve the inverse
problem. A classic one is the Herglotz-Wiechert integral. This
approach is based on Eqn 13, which gives the distance traveled
by a ray with ray parameter p as a function of the velocity
structure,

$$ \Delta(p) = 2p \int_{r_p}^{r_0} \frac{dr}{r (\zeta^2 - p^2)^{1/2}}, \qquad (25) $$

where ζ = r/v, and p is the ray parameter for the ray arriving at
Δ. This can be converted to

$$ \int_0^{\Delta_1} \cosh^{-1} \left( \frac{p(\Delta)}{\zeta_1} \right) d\Delta = \pi \ln \left( \frac{r_0}{r_1} \right), \qquad (26) $$

where ζ_1 = r_1/v_1 at radius r_1, the bottoming point of the ray
that emerges at Δ_1.¹ This formula is used by starting with an
observed travel time curve, T(Δ), and forming its derivative
dT/dΔ = p(Δ) numerically. The integral is done numerically from
Δ = 0 to Δ = Δ_1, using the fact that ζ_1 = dT/dΔ at a distance
Δ_1. The equation then gives the radius, r_1, at which the
velocity is v_1 = r_1/ζ_1.
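
The procedure just described can be sketched numerically. The code below is a minimal illustration under stated assumptions, not a production inversion: it assumes p(Δ) is already available on a grid starting at Δ = 0, decreases monotonically with Δ (no low-velocity zone or triplication), and it uses a simple trapezoidal rule. The function name and the homogeneous-sphere test case are hypothetical.

```python
import numpy as np

def herglotz_wiechert(delta, p, r0, k):
    """Bottoming radius r1 and velocity v1 of the ray emerging at delta[k] (Eqn 26).

    delta : distances in radians, increasing from 0
    p     : ray parameter dT/dDelta in s/rad at each distance, decreasing with delta
    r0    : surface radius
    """
    d = np.asarray(delta[:k + 1], dtype=float)
    ratio = np.asarray(p[:k + 1], dtype=float) / p[k]
    f = np.arccosh(np.maximum(ratio, 1.0))                     # guard against round-off below 1
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(d))     # trapezoidal rule
    r1 = r0 * np.exp(-integral / np.pi)                        # pi * ln(r0 / r1) = integral
    v1 = r1 / p[k]                                             # zeta_1 = r1 / v1 = p(Delta_1)
    return r1, v1

# Synthetic data for a homogeneous sphere, where T = (2 r0 / v) sin(Delta / 2),
# so p = dT/dDelta = (r0 / v) cos(Delta / 2).
r0, v_true = 6371.0, 10.0
delta = np.linspace(0.0, np.radians(60.0), 20001)
p = (r0 / v_true) * np.cos(delta / 2.0)
r1, v1 = herglotz_wiechert(delta, p, r0, len(delta) - 1)
```

For the homogeneous test case the ray to 60° is a chord that bottoms at r_0 cos 30°, so the recovered radius and velocity can be checked directly. Repeating the calculation for each Δ_1 on the grid builds up v(r) point by point.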

This method sometimes fails when velocity decreases with
depth, giving a low-velocity zone. In some such cases, it can still
be applied using a method called “earth stripping.” To do this,
v(r) is found down to the low-velocity zone using the Herglotz-
Wiechert integral. Equations 13 and 16 are then used with r′,
the outer radius of the low-velocity zone, substituted for r_p, to
find the distance and time the ray traveled on its way down to
the low-velocity zone. Subtracting these from the known T(Δ)
curve gives a T′(Δ) curve for a “mini-earth” with radius r′.
Because in this “mini-earth” the velocity increases with depth,
the Herglotz-Wiechert method is applied again.

¹ Bullen and Bolt (1985).

3.5 Body wave travel time studies 

We saw in the last section that travel time data can be used
to determine seismic velocity as a function of depth. Beginning
early in the 1900s, travel time tables were compiled by
combining data from many earthquakes observed at various
epicentral distances. These seismological observations provide
the primary data for our view of the basic features of the earth’s
velocity structure. This picture, an essentially layered earth
composed of a thin crust, a mantle, a liquid outer core, and
a solid inner core, is key to our thinking about how the earth
evolved and operates. This concept was largely developed by
the 1940s, as illustrated in Fig. 3.5-1, showing the classic
Jeffreys-Bullen¹ (JB) earth model. The JB model treated the
earth as a series of shells, characterized by the behavior of the
velocity with depth (Table 3.5-1). The mantle was divided
into an upper mantle (region B) and a lower mantle (region D),



Fig. 3.5-1 Comparison of the classic Jeffreys-Bullen earth model (Jeffreys
and Bullen, 1940) and a newer model, IASP91 (Kennett and Engdahl,
1991), plotted against depth (km). Although IASP91 and its successor,
AK135 (Kennett et al., 1995), have improved resolution in the mantle
transition zone and the core, the newer models are generally similar to
that derived using hand-cranked calculators.

¹ The model was derived from extensive joint research into earth structure by
Sir Harold Jeffreys (1891-1989), who established in 1926 that the core was liquid,
and Keith Bullen (1906-76).


Table 3.5-1 Regions in the Jeffreys-Bullen earth model.

Region  Depth (km)  Features of region

A       33          Crustal layers
B       413         Upper mantle: steady positive P and S velocity gradients
C       984         Mantle transition region
D       2898        Lower mantle: steady positive P and S velocity gradients
E       4982        Outer core: steady positive P velocity gradient
F       5121        Core transition: negative P velocity gradient
G       6371        Inner core: small positive P velocity gradient

Source: After Bullen and Bolt (1985).


both of which had smooth velocity gradients. These upper and
lower mantle regions were separated by region C, the mantle
transition zone where velocities increase rapidly with depth.
Below the core-mantle boundary (CMB), the core was divided
into an outer core (region E) and an inner core (region G),
separated by a transition zone (region F). The inner core
boundary (ICB) separated regions F and G. Subsequently,
the lower mantle was divided into regions D' (1000-2700 km
depth), most of the lower mantle with a smooth velocity gradient,
and D" (2700-2900 km), the zone above the core-mantle
boundary with a reduced velocity gradient.

Subsequent studies have derived models, such as the IASP91
model, also shown in Fig. 3.5-1, which confirm the basic
structure of the JB model and provide better resolution of
important regions. For example, the JB model did not resolve shear
velocities in the inner core, whereas recent models have finite S
velocity in the inner core, implying that it is solid. Similarly,
recent models provide more details about the mantle transition
zone and the core-mantle boundary, and do not include the
velocity “notch” at the inner core-outer core boundary.

Jeffreys’ and Bullen’s derivation of a radially symmetric
earth model from travel time observations converted the previous
crude picture of the earth into one that has since changed
only in detail. More recent radial velocity models do not differ
much from each other, so they are likely to be converging on an
accurate radial model for the earth. Such average, or reference,
models and travel time curves, such as JB, IASP91, and PREM
(for Preliminary Reference Earth Model, Section 3.8), are
derived from data around the world and so average over local
variations in structure. Regional differences can then be viewed
as perturbations relative to a reference model.

However, lateral differences in structure can be significant
and provide insight into tectonic processes. Thus a major
current goal of seismology is to define the three-dimensional
velocity structure that results from the fact that the earth is a
geologically active planet. Convection in the earth causes
three-dimensional temperature variations that result in observable
velocity variations. In addition, mantle flow appears to generate
seismic anisotropy at the top and the bottom of the mantle,
and magnetic stresses due to outer core flow may cause inner
core anisotropy. Resolving this three-dimensional structure
requires sophisticated analysis techniques. For example, travel
time studies are complemented by waveform modeling, and
stacking techniques are applied to enhance seismic signals. The
suggested reading provides some reviews of recent studies.

Fig. 3.5-2 Top: Long-period vertical component seismogram at Golden,
Colorado, showing various seismic phases. Bottom: Ray paths for some
of the seismic phases labeled on the seismogram. Paths taken as P waves
are shown as solid lines; paths taken as S waves are shown as dashed lines.
Although P and S are both direct phases, they do not travel the exact same
path because their velocities differ. Similarly, the ray path for PcS is
asymmetric, and pP and sP do not reflect off the surface at the same
location.

This section focuses on determining velocity structure, so 
we largely defer discussion of the chemical, mineralogical, 
thermal, and rheological factors that cause these variations for 
later sections. 

3.5.1 Body wave phases

We have seen that seismic waves can travel between a source
and a receiver along multiple paths. For example, increases in
velocity can cause triplications, yielding three distinct arrivals
at a receiver. Multiple reflections off various layers and
diffractions can bring additional arrivals. Hence seismograms contain
many arrivals, or phases, corresponding to different travel
paths. This is illustrated by Fig. 3.5-2, discussed in Section 1.1,
showing a few of the phases that are observed and some of
the corresponding ray paths. All the phases shown, except
for the Rayleigh surface wave, are body waves that travel
through the earth’s interior.

Fig. 3.5-3 Travel time data for various body wave phases and travel time
curves for model IASP91. The travel times are corrected to those for an
earthquake at the surface. The data are 57,655 travel times from 104
sources (earthquakes and explosions). (Kennett and Engdahl, 1991.)

Such seismograms provide the observations that are combined
to generate travel time tables. Figure 3.5-3 illustrates the
process; the dots are travel times observed at various epicentral
distances for a set of earthquakes and nuclear explosions. The
data define lines giving the travel times of different phases. Such
observations can be used to develop and test earth models
giving P and S velocities as a function of depth. These models
predict the observed travel times quite well, as shown by the
fit of the theoretical travel times (lines in Fig. 3.5-3) to the
observations. The travel times depend on the source depth, as
shown in Fig. 3.5-4 for a surface source and a source at 600 km
depth.

Although the details of an earth model depend on the specific
data used to construct it, the key features of IASP91 are
characteristic of recent models. The model represents a global
average of the velocity structure that varies somewhat between
locations. The crust is 35 km thick, an average between thin
oceanic and thick continental crust (Fig. 3.2-17). Velocities
increase smoothly through the upper mantle, to a depth of
410 km. The mantle transition zone, from about 400-700 km
depth, contains depth intervals near 410 km and 660 km
where the velocities increase rapidly.² Although these regions
are often referred to as the 410 km and 660 km discontinuities,
their exact depths vary from place to place. From about 700
to 2890 km depth the velocities increase smoothly throughout
the lower mantle. At about 2890 km, the P velocity drops
sharply, and the S velocity goes to zero, corresponding to the
liquid outer core. The outer core extends to a depth of about
5150 km, beneath which the solid inner core has higher velocities,
including a finite S-wave velocity. As we will see, these
variations in velocity with depth are thought to reflect important
changes in the physical, chemical, thermal, and mineralogical
state of the materials present.

Fig. 3.5-4 Travel time curves for body wave phases for model IASP91
computed for an earthquake at the surface (left panel, “IASP91: 0 km
source”) and at a focal depth of 600 km (right panel, “IASP91: 600 km
source”). (Kennett and Engdahl, 1991.)

² Although early estimates put the locations of the discontinuities at depths of 400
and 670 km, recent revisions place them closer to 410 and 660 km. We will use the
differing values interchangeably.

Seismic phases are named, based on their paths through the
earth (Fig. 3.5-5, Table 3.5-2). The direct P-wave and S-wave
arrivals are denoted “P” and “S.” Another class of arrivals
involve reflections at the earth’s surface. The P-wave arrival
corresponding to a single surface reflection is called PP, that
for two reflections is PPP, and so on. Similarly, SS and SSS
correspond to S waves reflected at the surface. Because P waves
can convert to S waves, and vice versa, PS is a P wave converted
to an S wave upon surface reflection, and SP is the reverse.
Consideration of the ray paths shows that the travel time for PP
at a given distance should be twice the travel time of P at half
that distance, that is, to a point midway between the source and
the receiver. Similarly, the travel time for PPP should be three
times the travel time for one-third the distance.

Table 3.5-2 Body wave phase nomenclature.

Name            Description

P               Compressional wave
S               Shear wave
K               P wave through outer core
I               P wave through inner core
J               S wave through inner core
PP              P wave reflected at surface
PPP             P wave reflected at surface twice
SP              S wave reflected at surface as P wave
PS              P wave reflected at surface as S wave
pP              P wave upgoing from focus, reflected at surface
sP              S wave upgoing from focus, converted to P at surface
c               Wave reflected at core-mantle boundary (e.g., ScS)
i               Wave reflected at inner core-outer core boundary (e.g., PKiKP)
P'              Abbreviation for PKP
P_d or P_diff   P wave diffracted along core-mantle boundary

Source: After Bolt (1982).
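
The relation between multiples and the direct phase can be written as a one-line rule. In the sketch below, the direct travel time function is a hypothetical stand-in (the chord time in a homogeneous sphere with an assumed radius of 6371 km and velocity of 10 km/s); in practice it would be a lookup in a travel time table such as IASP91.

```python
import math

def multiple_travel_time(t_direct, delta_deg, n_legs):
    """Travel time of a surface multiple made of n_legs equal legs.

    n_legs = 2 gives PP (or SS), n_legs = 3 gives PPP (or SSS)."""
    return n_legs * t_direct(delta_deg / n_legs)

def t_direct_homogeneous(delta_deg, r0=6371.0, v=10.0):
    # Illustrative direct travel time: a straight chord in a homogeneous sphere.
    return (2.0 * r0 / v) * math.sin(math.radians(delta_deg) / 2.0)

t_pp = multiple_travel_time(t_direct_homogeneous, 60.0, 2)     # 2 * T_P(30 degrees)
t_ppp = multiple_travel_time(t_direct_homogeneous, 60.0, 3)    # 3 * T_P(20 degrees)
```

The rule assumes a surface source and a laterally uniform earth, so that each leg is an identical copy of the direct path to the shorter distance.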

Fig. 3.5-5 Examples of body wave phases illustrating the nomenclature
used. “P” and “S” designate direct ray paths, whereas “p” and “s” denote
upgoing paths from the earthquake. Hence SP designates an S wave
through the mantle reflected at the surface as P. “c” designates a reflection
at the core-mantle boundary, so PcP is a P wave reflected at the core, and
PcS is a P wave reflected as S. “K” and “I” denote P waves that traveled
through the outer and inner cores, and “i” designates a reflection at the
inner core’s boundary. Hence PKIKP travels through the mantle, outer
core, and inner core. PKJKP, which travels as S through the inner core, has
only recently been conclusively observed. (After Bolt, 1982. From Inside
the Earth by Bruce A. Bolt. © 1982 by W. H. Freeman and Company.
Used with permission.)

The surface-reflected phases PP and SS (as well as SSS, SSSS
or S4, etc.) have unusual characteristics. By Fermat’s principle
(Section 2.5.9), seismic phases have either minimum or maximum
travel times with respect to adjacent paths. Most arrivals
(P, S, pP, ScS, etc.) are minimum-time phases, but the surface
reflections are maximum-time phases with respect to distance.
To see this, consider ray paths for a surface reflection that differ
slightly from the true path, so the reflection bounces off the
surface a small distance ε from the actual bounce point at Δ/2,
halfway between the source and the receiver (Fig. 3.5-6, top).
Their travel time is thus the sum of the travel times for two legs

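$$ T(\Delta) = T(\Delta/2 + \varepsilon) + T(\Delta/2 - \varepsilon). \qquad (1) $$

Expanding each leg in a Taylor series to second order in ε,

$$ T(\Delta/2 + \varepsilon) \approx T(\Delta/2) + \varepsilon \frac{dT}{d\Delta} + \frac{\varepsilon^2}{2} \frac{d^2 T}{d\Delta^2}, $$

$$ T(\Delta/2 - \varepsilon) \approx T(\Delta/2) - \varepsilon \frac{dT}{d\Delta} + \frac{\varepsilon^2}{2} \frac{d^2 T}{d\Delta^2}, \qquad (2) $$

shows that

$$ T(\Delta) = 2T(\Delta/2) + \varepsilon^2 \frac{d^2 T}{d\Delta^2}. \qquad (3) $$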



[Figure 3.5-6 panel labels. Flat earth: the alternative path is longer than
PP. Elliptical earth: the alternative path is the same length as PP, so all
reflected energy arrives at the same time. Spherical earth: the alternative
path is shorter than PP, so PP is a “maximum time” phase.]

Fig. 3.5-6 Top: Ray path for a surface reflection. The reflection is a
maximum-time phase, because the travel time for reflection at the
midpoint Δ/2 is longer than on nearby alternative paths. Bottom: Ray
paths for a surface reflection in a homogeneous medium, in which all
reflections off the elliptical surface have the same travel time. The
reflection off the midpoint is a minimum-time phase if the surface is flat,
and a maximum-time phase if the surface is circular.


By Fermat’s principle, the true ray path is that on which the
derivative of travel time with respect to ε is zero,

$$ \frac{dT}{d\varepsilon} = 2\varepsilon \frac{d^2 T}{d\Delta^2} = 0, \qquad (4) $$

so ε is zero, giving the expected bounce point. To see if this is a
minimum or a maximum, we form the second derivative

$$ \frac{d^2 T}{d\varepsilon^2} = 2 \frac{d^2 T}{d\Delta^2}. \qquad (5) $$


Figure 3.5-4 and Section 3.4.2 show that the direct P and S
waves have travel time curves that are concave down, d²T/dΔ²
< 0, so their surface reflections PP and SS are maximum-time
phases. Thus PP or SS waves traveling along the same azimuth
that reflect at the surface either closer or further than the point
where PP reflects arrive earlier. By contrast, a core reflection
like ScS has a travel time curve that is concave upward, so in
Eqn 5 d²T/dΔ² > 0, and its surface reflection ScS2 is a
minimum-time phase.

An intuitive way to view minimum- and maximum-time
phases is to consider ray paths for surface reflections in a
homogeneous medium (Fig. 3.5-6, bottom). An ellipse defines
the set of points whose summed distances to two points, or
foci, are equal. Thus, if an earthquake and a receiver were the
foci, the travel time for a reflection from any point on the ellipse
would be the same. Hence, if the surface were elliptical, the
reflected phase would be neither a minimum- nor a maximum-time
phase, because all the energy would arrive at the same
time. If the surface were flat, and thus had less curvature than
the ellipse, waves that reflect off the surface slightly closer or
further than the midpoint travel further, making the reflection
a minimum-time phase. However, if the surface were circular
and more curved than the ellipse, waves reflected off the surface
slightly closer or further than the midpoint travel a shorter
distance, making the reflection a maximum-time phase. This last
case is analogous to that for PP and SS in the spherical earth.
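
The maximum- and minimum-time behavior can be checked numerically by evaluating the two-leg travel time of Eqn 1 for a range of bounce-point offsets ε. The two travel time curves below are illustrative stand-ins, not earth models: a concave-down curve (the chord time in a homogeneous sphere) plays the role of direct P, and a concave-up, hyperbola-like curve with finite time at zero distance plays the role of ScS. All names and constants are assumptions.

```python
import numpy as np

def two_leg_time(t_direct, delta, eps):
    """Travel time of a surface reflection bouncing a distance eps from the midpoint (Eqn 1)."""
    return t_direct(delta / 2 + eps) + t_direct(delta / 2 - eps)

def t_concave_down(delta, r0=6371.0, v=10.0):
    # Chord travel time in a homogeneous sphere: second derivative is negative.
    return (2 * r0 / v) * np.sin(delta / 2)

def t_concave_up(delta, t0=500.0, q=300.0):
    # Hyperbola-like reflection curve: second derivative is positive.
    return np.sqrt(t0**2 + (q * delta)**2)

delta = np.radians(80.0)
eps = np.linspace(-0.2, 0.2, 41)        # offsets in radians; the middle sample is the midpoint
t_pp = two_leg_time(t_concave_down, delta, eps)
t_scs2 = two_leg_time(t_concave_up, delta, eps)

i_max = int(np.argmax(t_pp))            # latest arrival among the PP-like paths
i_min = int(np.argmin(t_scs2))          # earliest arrival among the ScS2-like paths
```

Both extrema fall at the midpoint sample, consistent with Eqns 4 and 5: the same stationary bounce point is a maximum for the concave-down curve and a minimum for the concave-up one.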

Although PP and SS are maximum-time phases with respect
to distance, they are minimum travel time phases with respect
to azimuth, as are most phases. Thus waves with a bounce
point off the great circle path between the source and the
receiver arrive later. This combination of maximum time with
respect to distance and minimum time with respect to azimuth
makes the surface reflections sample an “X”-shaped region of
the surface, known as the Fresnel zone (Section 3.7.3), near the
bounce point. The fact that these are maximum-time phases
also causes them to undergo a π/2 phase shift³ (Fig. 2.6-5).
Each successive bounce at the surface causes another π/2 phase
shift, so SSS is phase-shifted by π and inverted with respect to
direct S. S4 undergoes a 3π/2 phase shift, and S5 has a 2π shift,
giving it the same shape as the original S.
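
The effect of repeated π/2 phase shifts can be illustrated by applying a constant phase shift to every frequency component of a pulse. This is a minimal sketch using the discrete Fourier transform; the function name is hypothetical, the sign convention for the shift is an assumption, and the example uses a zero-mean signal with an odd number of samples so that no zero-frequency or Nyquist term needs special handling.

```python
import numpy as np

def phase_shift(x, phi):
    """Shift the phase of every nonzero-frequency component of a real signal by phi."""
    n = len(x)
    X = np.fft.rfft(x)
    X[1:] *= np.exp(1j * phi)           # leave the zero-frequency term untouched
    return np.fft.irfft(X, n=n)

n = 257                                  # odd length, so there is no Nyquist bin
t = np.arange(n) / n
x = np.cos(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

one_bounce = phase_shift(x, np.pi / 2)              # analogous to SS relative to S
two_bounces = phase_shift(one_bounce, np.pi / 2)    # analogous to SSS: inverted
four_bounces = phase_shift(x, 2 * np.pi)            # analogous to S5: original shape

# A shift of -pi/2 turns a cosine into a sine under this sign convention.
cos_to_sin = phase_shift(np.cos(2 * np.pi * 5 * t), -np.pi / 2)
```

Two successive shifts of π/2 reproduce the inverted pulse and four reproduce the original, matching the pattern described for SSS and S5.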

Figure 3.5-5 is drawn for an earthquake beneath the earth’s
surface. Because earthquakes occur to depths of 700 km, seismic
ray paths go up from earthquakes as well as down. Lower-case
“p” and “s” identify upgoing compressional and shear
waves (Fig. 3.5-2). pP goes up as a P wave and reflects near
the epicenter, whereas sP goes up as an S wave and converts to
a P wave at the surface. These reflections are useful because
the travel time difference between direct P and pP, for example,
indicates the depth of the earthquake. After an upgoing wave
reflects at the free surface, it can undergo later conversions,
so pPP, sPS, etc. are possible arrivals.
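
As a rough illustration of how the pP - P time constrains depth, the sketch below uses a plane-wave approximation that is not given in the text above: for a source at depth h in material of uniform P velocity v, with the ray leaving at angle i from the vertical, the extra path of pP gives a delay of about 2h cos i / v. The function name, the velocity, and the delay are hypothetical, and real depth determinations use travel time tables rather than this simplification.

```python
import math

def depth_from_pP_minus_P(dt, v, incidence_deg=0.0):
    """Approximate source depth (km) from the pP - P delay dt (s).

    Assumes a uniform P velocity v (km/s) above the source and a plane wave
    leaving at incidence_deg from the vertical, so dt = 2 h cos(i) / v."""
    return dt * v / (2.0 * math.cos(math.radians(incidence_deg)))

# Hypothetical values: a 10 s delay, 8 km/s average velocity, vertical ray.
h_vertical = depth_from_pP_minus_P(10.0, 8.0)
h_oblique = depth_from_pP_minus_P(10.0, 8.0, incidence_deg=30.0)
```

For a fixed delay, a more oblique ray implies a greater depth, which is why the distance to the station also enters a careful depth estimate.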

Many other body wave phases have been identified and are
included in travel time tables. In addition, some tables give
arrival times for Love and Rayleigh surface waves. As shown in
Fig. 2.7-4, these surface waves are dispersive, so different
frequencies have different arrival times, making the time shown
approximate. This time is still useful for various purposes,
including allowing earth structure studies to avoid phases
that may be obscured by surface waves. In many cases, deep
earthquakes are used for body wave studies, because they
generate only small surface waves.

³ This phase shift, also known as a Hilbert transform, can be viewed by thinking of
the pulse as made up of sine and cosine functions and turning each into the other.

Finally, it is worth remembering that travel time tables are
compiled from observations of seismic arrivals. Although most
arrivals on seismograms can be identified today from existing
tables, important results are still found by noticing and
explaining a previously unrecognized arrival.

3.5.2 Core phases 

The contrast in properties between the solid mantle and the 
liquid core, which has lower velocity than the mantle above, 
makes the core well suited to seismological study using reflected, 
transmitted, converted, and diffracted arrivals. 

Core reflections are of great interest because the core-mantle
boundary (CMB) is a solid-liquid boundary, and thus a strong
reflector for shear waves. Reflections off the CMB are denoted
by a lower-case “c,” so ScS is an S-wave reflection and PcP is
a P-wave reflection. Conversions at the CMB also occur. ScP
goes down through the mantle as a shear wave and returns as
a compressional wave, whereas PcS does the reverse. Some
phases undergo multiple reflections at both the core and the
surface; ScSScS (or ScS2) bounces twice at the CMB and once at
the surface. Such reflections, known as multiple ScS, are shown
in Fig. 1.1-4.

ScS is a more distinct arrival than PcP, because the liquid
core does not transmit shear waves. The SH part of the motion
in the incident ScS cannot convert to P waves at the CMB, so is
totally reflected. Hence ScS is often well recorded on the
transverse component (Section 2.4.4) of a seismometer. By contrast,
the PcP reflection is generally weak, because the impedance
contrast (Section 2.6.6) is small, so most P energy incident
on the CMB is transmitted. The small impedance contrast
(about 5%) arises because the P-wave velocity decrease going
from the mantle to the core (about 13.7 km/s to 8.1 km/s) is
offset by the density increase (about 5.5 g/cm³ to 9.9 g/cm³).
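
The near cancellation can be seen by computing the acoustic impedances (density times velocity) on either side of the boundary from the approximate values quoted above. The sketch below uses the standard normal-incidence reflection coefficient for P waves; the function name is hypothetical, the sign convention (incident wave in medium 1) is an assumption, and oblique incidence would require the full reflection coefficients.

```python
def normal_incidence_reflection(rho1, v1, rho2, v2):
    """Normal-incidence P-wave amplitude reflection coefficient for a wave in medium 1."""
    z1 = rho1 * v1                      # impedance of the medium containing the incident wave
    z2 = rho2 * v2                      # impedance of the medium beyond the interface
    return (z2 - z1) / (z2 + z1)

# Approximate values from the text: lowermost mantle (medium 1), outermost core (medium 2).
z_mantle = 5.5 * 13.7                   # g/cm^3 * km/s
z_core = 9.9 * 8.1
R = normal_incidence_reflection(5.5, 13.7, 9.9, 8.1)
energy_fraction = R**2                  # fraction of incident energy reflected
```

With these rounded inputs the two impedances differ by only a few percent, so the reflected amplitude is a few percent of the incident amplitude and the reflected energy is far smaller still.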

Core reflections, especially ScS, are useful in studies of earth 
structure, because they give a vertical average velocity for the 
mantle. The travel time curves for these phases are concave 
upward (Fig. 3.5-4), like that for the reflection off the top of a 
layer in a flat geometry (Section 3.2.1). Similarly, they have 
finite travel time at zero distance because of the time needed to 
get down to the core and back. 

The travel times and amplitudes of core phases are also used
to study structure near and within the core, because their ray
paths are sensitive to the structure. To illustrate this idea,
consider P-wave ray paths (Fig. 3.5-7, top left) within the earth.
Rays leaving the source at progressively smaller angles of
incidence (closer to the vertical) bottom deeper in the mantle
and so reach greater distances. As the bottoming depth
approaches the core-mantle boundary, the travel times of P
and PcP converge (Figs 3.5-3 and 3.5-4). Eventually, at about 98°
(the precise distance depends on the depth of the earthquake
and the exact velocity structure), P grazes the core-mantle
boundary, and P and PcP are identical.

Fig. 3.5-7 Ray paths and travel times for major core phases, computed for
earth model PREM. Top left: Paths for direct rays (i.e., excluding
reflections and diffractions). Right: Ray paths for four other phases: PKP
passes through the outer core, PKIKP also penetrates the inner core,
PKiKP reflects from the boundary between the outer and inner cores, and
P_d (also called P_diff) diffracts along the core-mantle boundary. Lower
left: Travel time curves for these phases. Points on the earth’s surface are
labeled with their distances in degrees.


A ray with a slightly smaller angle of incidence, however,
refracts downward at the CMB, because the core has a lower
P velocity than the mantle. It thus enters the core, travels
through it, refracts into the mantle, and reaches the surface.
This phase is called PKP, where “K” denotes passage through
the outer core.⁴ For an angle of incidence slightly below
grazing, PKP reaches the surface at point A (Fig. 3.5-7, top
right), at a distance close to 180°. Rays with smaller angles
of incidence penetrate deeper into the core, and thus arrive
at distances successively less than 180°, down to a distance of
about 145° (point B). At this point the pattern reverses, because
rays with smaller angles of incidence arrive at successively
greater distances. This goes on for rays reaching distances up to
point C (~153°, depending on the earth model), corresponding
to the ray that grazes the inner core-outer core boundary.

The ray paths show that the low velocity in the outer core
gives rise to a geometrical shadow zone, where Snell’s law
predicts that no direct rays arrive.⁵ We have seen (Fig. 3.4-7)
that the corresponding travel time curve should have a break
due to the shadow zone, and then two branches on the far
side of the shadow zone. For the core, the shadow zone occurs
for distances between ~98° and ~145° (point B, Fig. 3.5-7, top
left). Beyond 145°, the travel time curve has two branches for
PKP. The AB branch (sometimes labeled PKP₂) is the back
branch, on which rays with smaller angles of incidence appear
at smaller distances, whereas the BC branch is the forward
branch on which rays with smaller angles of incidence appear
at larger distances.

⁴ “K” is from Kern, the German word for core.

In reality, body waves are observed in the shadow zone. 
Much of the body wave energy arrives as surface-reflected (PP,
PPP, SS, etc.) or multiply core-reflected (ScS2, etc.) arrivals. 
Other arrivals are due to P waves that encounter the inner core. 
Because the inner core has higher P-wave velocity than the 

5 Although the core’s existence had been inferred from the earth’s gravity (Section 3.8),
the discovery of this shadow zone in 1906 by Richard Oldham (1858-1936)
provided the first direct evidence and set the paradigm for future core studies.
outer core, waves refract upward and emerge in the shadow
zone. These phases are known as PKIKP, because P waves in
the inner core are denoted by “I.” In addition, waves reflect
at the boundary between the inner and outer cores, giving the
phase PKiKP. 6 (The lower-case “i” is analogous to the lower-case
“c” in PcP.) The travel time curve thus has a PKIKP
branch DF, where D is the distance at which PKIKP is first
observed, and a back branch for PKiKP. The back branch
begins at C, where PKiKP and PKP are the same, and extends
through D back to zero distance (Figs 3.5-3 and 3.5-4), because
the reflection occurs at vertical incidence. Hence the portion of
the travel time curve containing CD and DF is due to the rapid
increase of velocity at the inner core-outer core boundary, and
is analogous to a triplication.

Seismic energy also enters the shadow zone via P and S waves
that diffract around the core (Section 2.5.10). The ray paths for
the diffracted P waves (denoted Pd or Pdiff) shown in Fig. 3.5-7
represent energy that diffracted around the core, left the CMB,
and traveled back to the surface. This process is much like that
discussed for the head wave (Section 3.2.1). Thus, once the
direct P wave becomes the diffracted wave at a distance near
100°, its travel time curve (Fig. 3.5-4) loses the curvature it had,
because successive rays penetrated deeper to higher-velocity
material. Instead, it becomes linear because all the diffracted
waves bottom at the CMB, and so have the same ray parameter
and hence apparent velocity. As for the head wave, assuming
that the energy followed a ray path gives the diffracted wave’s
travel time but cannot fully describe its amplitude, because
diffraction involves energy propagating as waves, not rays.
However, we will see that more complete formulations such as
normal modes predict both the times and the amplitudes of the
diffracted phases.
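
The linear travel time branch can be made concrete with a short calculation. A ray bottoming at radius r, where the velocity is v, has ray parameter p = r/v, and p is also the slope dT/dΔ of the travel time curve. The sketch below assumes an earth radius of 6371 km, a CMB depth of 2890 km, and a P velocity of about 13.7 km/s at the base of the mantle (a PREM-like value not quoted in the text); the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
CMB_DEPTH_KM = 2890.0
V_P_BASE_MANTLE = 13.7  # km/s, assumed P velocity just above the CMB


def slowness_s_per_deg(bottoming_radius_km, velocity_km_s):
    """Ray parameter p = r / v (seconds per radian), converted to s/deg."""
    return (bottoming_radius_km / velocity_km_s) * math.pi / 180.0


r_cmb = EARTH_RADIUS_KM - CMB_DEPTH_KM
p_diff = slowness_s_per_deg(r_cmb, V_P_BASE_MANTLE)

# Apparent velocity along the surface: surface radius divided by p in s/rad.
apparent_velocity = EARTH_RADIUS_KM / (r_cmb / V_P_BASE_MANTLE)

print(f"Pdiff slope dT/dDelta: {p_diff:.2f} s/deg")
print(f"apparent surface velocity: {apparent_velocity:.1f} km/s")
```

Because every diffracted ray shares this one value of p, each additional degree of distance adds the same travel time, which gives a straight line rather than a curve.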

Figure 3.5-7 shows that the travel time curve for the core
phases is complicated because it combines the effects of a
geometric shadow zone, which gives two PKP branches, a
triplication-like feature containing the PKIKP and PKiKP
branches, and a diffraction branch. In reality, even these models
are simplifications of a more complex reality. Figure 3.5-8,
showing the travel times of several million PKP arrivals, illustrates
several significant deviations from the theoretical curves
in Fig. 3.5-7. First, the arrivals do not fall along narrow lines.
This is partly due to errors of observation, but also due to the
heterogeneous structure of the crust, mantle, and core, which
makes some arrivals early and others late. Second, the PKP-BC
branch continues beyond its geometrically predicted limit of
153°. This is because the PKP-BC wave diffracts around the
inner core, although its amplitude decreases rapidly in the
process, so there are few observations beyond 160°.

Third, and most importantly, the PKP travel times show an
additional branch not predicted by geometric ray theory. These
arrivals, labeled PKP precursors, appear to be a continuation
of the PKP-AB branch and arrive as much as 20 s before the

6 Observations of this phase by Inge Lehmann (1888-1993) in 1936 provided the
first evidence for the existence of the inner core.



Fig. 3.5-8 Arrival times of PKP waves recorded by the International 
Seismological Centre during 1964-87. A point is plotted if there are at 
least 200 arrivals in the catalog for that time and distance. Although these 
arrival times are similar to the predicted travel time curves in Fig. 3.5-7, 
there are some differences. The PKP-BC branch is observed beyond its 
geometrical limit (153°) due to diffraction around the inner core, and 
precursors to the PKP-DF branch are observed that result from seismic 
scattering at the CMB and in the mantle. (Courtesy of K. Koper.) 

PKP-DF branch. These arrivals puzzled seismologists until it 
was realized that they were waves reflected, or scattered, from 
inhomogeneous structures in the mantle. This scattering is 
analogous to that discussed in Section 3.3.7 in the context of 
migration in reflection seismology. Because the scatterers are 
comparable in size (about 10-15 km) to the wavelengths of 
short-period P waves in the lower mantle, they behave as 
Huygens’ sources (Section 2.5.10). Thus a PKP-AB wave 
interacting with a scatterer at the CMB radiates waves in all 
directions (Fig. 3.5-9). Those arriving before PKP-DF are 
clearly observed, whereas those arriving afterwards are lost 
amid PKP-DF. The range of observable scattered PKP waves is 
shown as the shaded regions in Fig. 3.5-9, illustrating another 
way in which seismic energy reaches the shadow zone. Although 
most such scattering occurs near the CMB, modeling of the 
PKP precursors suggests that waves are also scattered by small
reflectors throughout the mantle, as shown by the dark shaded 
region in Fig. 3.5-9. 
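
The comparison between scatterer size and wavelength can be checked with the relation wavelength = velocity × period. This is only an order-of-magnitude sketch: the lower mantle P velocity of 13.7 km/s and the choice of periods near 1 s for "short-period" waves are assumptions, not values given in the text.

```python
V_P_LOWER_MANTLE = 13.7  # km/s, assumed P velocity near the base of the mantle


def wavelength_km(velocity_km_s, period_s):
    """Wavelength of a wave with the given velocity and period."""
    return velocity_km_s * period_s


for period in (0.5, 1.0, 2.0):
    lam = wavelength_km(V_P_LOWER_MANTLE, period)
    print(f"period {period:.1f} s -> wavelength {lam:.1f} km")
```

For a period near 1 s the wavelength falls within the 10-15 km range quoted for the scatterers, consistent with their acting as Huygens' sources rather than as simple reflecting surfaces.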

Some core phases begin as S waves (Fig. 3.5-5). Although no
S waves propagate in the liquid outer core, phases like SKS
travel through the mantle as an S wave and through the core
as a P wave. SKKS is similar to SKS, but also involves an
underside reflection at the CMB. Because the P velocity of the
uppermost core (about 8.1 km/s) is not much larger than the S
velocity of the lowermost mantle (about 7.2 km/s), SKS and
SKKS waves do not change direction significantly as they cross
the CMB. Thus SKS, SKKS, SKKKS, etc. are the only waves
that bottom near the top of the core and are used to constrain
the outer core’s velocity structure.
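
How little SKS bends at the CMB, compared with PKP, can be illustrated by the change in ray direction that Snell's law predicts for each conversion. The sketch uses the 7.2 and 8.1 km/s values quoted above; the mantle P velocity of 13.7 km/s is an assumed PREM-like value, and the function name is hypothetical.

```python
import math

V_S_MANTLE = 7.2   # km/s, S velocity of the lowermost mantle
V_P_MANTLE = 13.7  # km/s, assumed P velocity of the lowermost mantle
V_P_CORE = 8.1     # km/s, P velocity of the uppermost core


def deflection_deg(incidence_deg, v_in, v_out):
    """Change in ray direction on crossing the interface, in degrees.

    Positive values mean the ray bends away from the normal, negative
    values mean it bends toward the normal.
    """
    s = math.sin(math.radians(incidence_deg)) * v_out / v_in
    if s > 1.0:
        raise ValueError("beyond the critical angle: no transmitted ray")
    return math.degrees(math.asin(s)) - incidence_deg


for i1 in (10.0, 30.0, 50.0):
    sks = deflection_deg(i1, V_S_MANTLE, V_P_CORE)  # S in mantle to P in core
    pkp = deflection_deg(i1, V_P_MANTLE, V_P_CORE)  # P in mantle to P in core
    print(f"incidence {i1:4.1f} deg: SKS bends {sks:+5.1f}, PKP bends {pkp:+5.1f}")
```

For each of these incidence angles the SKS deflection is several times smaller in magnitude than the PKP deflection. Because 8.1 km/s exceeds 7.2 km/s, the function raises an error for S waves beyond the critical angle, where no wave is transmitted into the core.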
Fig. 3.5-9 A model for the PKP precursors shown in Fig. 3.5-8. Top: PKP-AB
waves that interact with scatterers (stars) cause arrivals at distances
less than the geometrically allowed AB range. As shown by the travel times
(bottom), these arrivals precede the PKP-DF arrivals. Scatterers at the base
of the mantle yield waves in the light and dark shaded regions, and others
in the mid-mantle yield waves in the dark shaded region. (Hedlin et al.,
1977. Reproduced with permission from Nature.)


Other core phases, some of which are not included in the
travel time plots in Fig. 3.5-4, have also been reported. These
include PKKP (Fig. 3.5-10), a P wave that has undergone underside
reflection at the CMB, PKPPKP (sometimes called P'P'),
a PKP phase reflected at the surface, and PKIIKP, an underside
reflection from the outer core-inner core boundary. An
especially elusive phase has been PKJKP, which, by analogy to
PKIKP, travels through the inner core as an S wave. The weak
amplitude of this phase, combined with the fact that it arrives
late in the seismogram amid other phases, has made it difficult
to observe. PKJKP has been verified only recently, by stacking
data from very large deep earthquakes that generate the large
body waves needed to produce even small PKJKP, while not
generating surface waves that mask the small core arrivals. 7

7 This observation and normal mode results are overcoming seismologists’ prior 
reservations, exemplified by comments like “the inner core, which may exist, is said 
to have the following properties. ...” 



Fig. 3.5-10 Some additional core phases. PKKP and PKIIKP are 
underside reflections at the core-mantle boundary and outer core-inner 
core boundaries, and PKPPKP (P'P') is an underside surface reflection. 

Core phases can be challenging to study with travel time data 
because their travel time curves are complicated and some 
of the arrivals are small. Amplitude and waveform studies 
provide additional information. As we have seen (Section 3.2.3), 
amplitudes can be used to differentiate between structures that 
would give similar travel times. Some insight into the amplitudes
can be obtained from the ray densities (Section 3.4.2).
For example, the AB and BC branches of the PKP travel time 
curve meet at the far side of the shadow zone, at point B. 
Figure 3.5-7 shows that rays which left the source at uniform 
angle increments are concentrated there, so large amplitudes 
are expected at this caustic. 
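
The concentration of rays at the caustic can be illustrated with a toy calculation. Near point B the distance reached by a ray passes through a minimum as the angle of incidence varies, so rays leaving the source at uniform angle increments pile up there. The parabolic distance curve below and its curvature are purely illustrative assumptions, not the PREM result; only the 145° location of point B comes from the text.

```python
import math
from collections import Counter

DELTA_B = 145.0   # degrees, distance of the caustic (point B)
CURVATURE = 0.35  # illustrative: degrees of distance per (degree of angle)^2


def distance_deg(angle_offset_deg):
    """Toy distance versus angle curve with its minimum at point B."""
    return DELTA_B + CURVATURE * angle_offset_deg ** 2


# Rays at uniform 0.1 degree increments about the ray that reaches point B.
offsets = [k * 0.1 for k in range(-50, 51)]

# Count the rays that land in each 1 degree distance bin.
rays_per_bin = Counter(math.floor(distance_deg(o)) for o in offsets)

for start in sorted(rays_per_bin):
    print(f"{start}-{start + 1} deg: {rays_per_bin[start]} rays")
```

In this toy model the bin adjacent to point B collects more than twice as many rays as any other bin, which is the ray-density signature of the large amplitudes expected at the caustic.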

This discussion of amplitudes brings out another interesting 
point. Although the earth is approximately spherical, we have 
discussed only waves propagating in the plane containing the 
source, the receiver, and the center of the earth. One case in 
which sphericity is important is near the antipode, the point 
180° from the source. Figure 3.5-11 shows seismograms recorded
at PTO (Porto, Portugal) and MAL (Malaga, Spain)
from an earthquake in New Zealand. Phases like PP and PKP 
are focused at the antipode, because paths in any direction 
from the source arrive at the same time. Note the larger arrivals 
at PTO, only 0.7° from the antipode. 
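
A rough sense of the focusing comes from the geometric-spreading term of ray theory on a sphere, in which energy density scales as 1/sin Δ, so that amplitude scales as 1/sqrt(sin Δ). The sketch below considers only this term and ignores all other amplitude effects. The 170° comparison distance is a hypothetical choice, and ray theory itself breaks down at the antipode, where the factor diverges.

```python
import math


def spreading_factor(distance_deg):
    """Relative amplitude factor 1/sqrt(sin(distance)) from spherical geometry."""
    return 1.0 / math.sqrt(math.sin(math.radians(distance_deg)))


near_antipode = spreading_factor(180.0 - 0.7)  # a station 0.7 deg from the antipode
comparison = spreading_factor(170.0)           # hypothetical station 10 deg away

print(f"relative amplitude, 179.3 deg vs 170 deg: {near_antipode / comparison:.1f}")
```

Under this assumption a station 0.7° from the antipode records amplitudes several times larger than one 10° away, in qualitative agreement with the larger arrivals at PTO.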

3.5.3 Upper mantle structure 

The velocity structure of the upper mantle shows two major 
effects. First, it has discontinuities and velocity gradients that 
are essentially radially symmetric, which are believed due to the 
effects of pressure on the minerals present. Second, it contains 
significant lateral heterogeneity that is primarily associated 
with temperature variations due to cold subducting oceanic 
lithosphere. We discuss the radial velocity structure here, 
Fig. 3.5-11 Focusing of P waves at the antipode, 180° from an earthquake. Left: Seismic rays PKP-AB and PP focus at the antipode. Dashed lines
represent wave fronts, whose propagation time in minutes is given. Right: Seismograms showing antipodal focusing of core phases. (Rial and Cormier,
1980. J. Geophys. Res., 85, 2661-8, copyright by the American Geophysical Union.)


explore its mineralogical causes in Section 3.8, and consider the 
effects of subducting lithosphere in Section 5.4. 

We have already discussed the velocity structure of the uppermost
mantle shown by surface wave dispersion (Section 2.8).
Body wave analyses reveal a similar structure. The sub-crustal
lithosphere shows generally fast P- and S-wave velocities of
about 8.1 and 4.5 km/s. This high-velocity layer provides a
way of defining the lithosphere, termed the seismic lithosphere
or lid, from seismological observations. The thickness of the
seismic lithosphere varies with location. At mid-ocean ridges,
where oceanic plates are created, its thickness approaches zero.
Beneath stable cratons, the fast lithospheric velocities extend
to about 200 km. As a global average, the seismic lithosphere
extends to about 80-100 km depth.

In most regions of the world, we find a seismic low-velocity
zone (LVZ) beneath the seismic lithosphere. The LVZ approximately
coincides with the expected mechanically weak
asthenosphere underlying the stronger lithosphere. The lithosphere
and asthenosphere are defined by their mechanical
properties, such that plates of strong lithosphere slide over
weaker asthenosphere. This contrast, as we will see, results
from the fact that the lithosphere is the cold outer thermal
boundary layer of the solid earth (Sections 3.8, 5.1). By contrast,
the high-velocity seismic lithosphere and underlying LVZ
are seismologically defined entities. The rough correspondence
between the seismological and mechanical layers indicates that
the two are closely related, and that seismic observations can be
used to map mechanical structure. The two sets of layers are
not identical for several reasons including the fact that seismic


waves sample physical properties over a period of seconds, 
whereas the lithosphere and asthenosphere are inferred from 
data sampling periods of thousands and millions of years 
(Section 5.7). 

The depth and magnitude of the LVZ vary regionally. In
tectonically active regions like western North America, the
LVZ is well developed and relatively shallow. In stable continental
regions that have not experienced tectonism for a long
time, the LVZ is deeper and less pronounced, and may not even
be present. The thick, high-velocity layer under continents has
led to the suggestion that it may reflect a chemically distinct
tectosphere. On this hypothesis, continents behave differently
from the oceanic lithosphere, where surface wave dispersion
shows a pronounced LVZ for all ages (Fig. 2.8-7). This persistence
may reflect the fact that oceanic lithosphere is never older
than 180 Ma, tectonically young by continental standards,
because older oceanic lithosphere is subducted away.

We will show in Section 5.7 that the contrast between the 
high-velocity seismic lithosphere and the asthenosphere LVZ 
is probably related to variations in material strength between 
the cold lithosphere and the warmer asthenosphere. There may 
also be some effects of partial melting. This situation differs 
from the velocity differences between the crust and the mantle, 
which result from their different compositions. Beneath the 
LVZ, which extends to an average depth of about 200 km, 
temperatures increase only slowly, but velocities increase significantly
in response to the increasing pressure.

The transition zone between the upper and lower mantles is 
marked by the velocity discontinuities at depths of about 410 
Fig. 3.5-12 Ray paths for P waves through the upper mantle, computed for earth model PREM, showing triplications due to mantle discontinuities.


and 660 km. We saw in the last section that a rapid velocity 
increase (Fig. 3.4-6) produces a triplication in the travel time 
curve. Upper mantle travel times show two triplications around 
15° and 22° caused by the 410 and 660 km discontinuities. 
Ray paths for such a structure are shown in Fig. 3.5-12. Some 
reference models such as PREM also have a discontinuity at 
220 km, and regional studies also often find discontinuities at 
other depths in the upper mantle. 

One difficulty in studying the transition zone, or other regions
of complex velocity structure, is that travel time curves
are composites of data from many earthquakes at different distances.
The process of combining the data can make the details
of the triplication difficult to observe. Moreover, dT/dΔ, the
derivative of the travel time curve that is used in inverting for
velocity, is uncertain due to the scattered data. These difficulties
can be addressed in several ways. One is to derive information
from the waveforms as well as the travel times. A second is
to use arrays of seismometers spaced closely enough that it is
possible to identify arrivals corresponding to the different
branches of triplications and directly measure dT/dΔ by tracing
them across the array. Such dense data also facilitate waveform
studies.
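
Measuring dT/dΔ across an array amounts to fitting a straight line to arrival time versus distance for one branch. The sketch below uses an ordinary least-squares slope. The station distances and arrival times are hypothetical numbers invented for illustration, and real array processing would typically use waveform stacking rather than picked times.

```python
def slowness_from_array(distances_deg, arrival_times_s):
    """Least-squares slope dT/dDelta (s/deg) of arrival time against distance."""
    n = len(distances_deg)
    mean_d = sum(distances_deg) / n
    mean_t = sum(arrival_times_s) / n
    num = sum((d - mean_d) * (t - mean_t)
              for d, t in zip(distances_deg, arrival_times_s))
    den = sum((d - mean_d) ** 2 for d in distances_deg)
    return num / den


# Hypothetical picks of one branch at five closely spaced stations.
distances = [20.0, 20.2, 20.4, 20.6, 20.8]         # degrees
times = [270.00, 271.85, 273.55, 275.45, 277.20]   # seconds

p_measured = slowness_from_array(distances, times)
print(f"measured dT/dDelta: {p_measured:.2f} s/deg")
```

Because all stations record the same earthquake, the scatter that affects composite travel time curves (differing source depths, origin time errors) largely cancels in the slope.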

Figure 3.5-13 illustrates these ideas with an array
study of upper mantle P-wave structure under the Gulf of
California spreading center. Data from ten earthquakes are
combined into a record section for the epicentral distance range
9-40°. The travel time curves show two triplications, one near
15° due to the 410 km discontinuity, and another around 22°
due to the 660 km discontinuity. Travel times and synthetic
seismograms predicted by the velocity structure (GCA) derived
from the data fit the data well, including the back branches
(C-B and D-E) of the triplications. The effects of the discontinuities
appear in the p(Δ) data as two groups of later arrivals for which
p increases with Δ. These arrivals are the back branches of the
triplications (Fig. 3.4-6). The remaining arrivals show p decreasing
with Δ, and thus are the forward branches.
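
The forward and back branches can be separated mechanically by the sign of the slope of p(Δ). The sketch below labels each segment between consecutive samples taken in order along a travel time curve. The four (Δ, p) corner points are hypothetical values for a single triplication, not data from the study.

```python
def classify_branches(points):
    """Label each segment between consecutive (distance_deg, p) samples.

    Samples must be ordered along the travel time curve. A segment on which
    p increases with distance is a back branch; otherwise it is forward.
    """
    labels = []
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        slope = (p1 - p0) / (d1 - d0)
        labels.append("back" if slope > 0 else "forward")
    return labels


# Hypothetical corner points of one triplication: (distance in deg, p in s/deg).
corner_points = [(10.0, 13.5), (17.0, 12.5), (14.0, 11.0), (25.0, 10.0)]

print(classify_branches(corner_points))
```

Along the curve p decreases steadily, so the back branch is the segment on which distance retreats while p keeps falling, which gives a positive slope in the p(Δ) plot.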

Figure 3.5-14 compares the GCA model to upper mantle 
models for other tectonic environments: ARC-TR (arc-trench) 
for the Japan subduction zone, T7 for the tectonically active 
western portion of North America, and K8 for the stable 
Eurasian shield. Above 200 km, all show a LVZ overlain by 
a higher-velocity lid, but the depth and extent of the LVZs 
differ. The shield model, for example, has the thickest lid. 
Below 200 km, GCA shows the lowest velocity. The depths of 


the 410 and 660 km discontinuities differ between the models. 
These differences are thought to reflect the fact that the mineral 
phase transformations causing the discontinuities occur at 
pressures (and hence depths) that depend on temperature. Thus 
lateral temperature changes, especially those associated with 
subduction zones, should change the depths at which these 
transitions occur (Section 5.4.2). 

Waveform modeling provides additional information about 
the transition zone. For example, waveform modeling of 
intermediate-period S waves shows a discontinuity at about 
520 km depth that is not observed with short-period P waves. 
The phase transition thought to cause this discontinuity may 
occur over a greater depth range than for the 410 and 660 km 
discontinuities, making it visible only to longer-period waves. 

3.5.4 Lower mantle structure 

Velocities increase rapidly with depth for roughly 100 km
beneath the 660 km discontinuity, but then increase more
slowly. The rapid increase implies that mineral transformations
continue, whereas the slow increase implies that the
mineralogy and composition of the material are not changing
significantly, and that the velocity increases are primarily
due to the material being compressed by higher pressure.
However, weak seismic discontinuities have been reported at
a variety of depths such as 900 and 1300 km. These may represent
either global discontinuities like the 410 and 660 ones,
or local velocity anomalies, perhaps due to fragments of old
subducted slabs.

The situation changes dramatically in the D" layer at the very 
base of the mantle, a fascinating and poorly understood region 8 
that has a velocity structure whose complexity rivals that of 
the lithosphere. D", the bottom few hundred kilometers of the 
mantle, was initially differentiated from the rest of the mantle 
(D') because the velocity gradient with depth is lower. This 
lower gradient is expected, because D" is a thermal boundary 
layer between the mantle and hotter core. The expected 
~1000°C temperature difference across D" would lower 
velocities and thus decrease the velocity gradient. 

However, detailed velocity models show that at the top 
of this lower-gradient region the velocity increases sharply 

8 The uncertainties about D" have been illustrated by describing its thickness as 
250±250 km (Jeanloz, 1990). 
Fig. 3.5-13 Seismic array study of upper
mantle structure. Top: Record sections,
plotted with a reducing velocity of 107s,
showing observed (left) and synthetic (right)
seismograms. Bottom left: Reduced travel
time plot, showing travel time data and
model predictions. Bottom right: p(Δ)
plot and model predictions. The two
triplications are evident in the record
sections, travel time plot, and p(Δ) plots.
The slight break in the travel time curve at
13° is due to use of slightly different models
(GCA' versus GCA). (Walck, 1984.)



(Fig. 3.5-15). A feature of such models is that the high- and
low-velocity regions trade off to give similar travel times as
PREM, which does not contain the high-velocity region. Thus
D" is now often delineated by the location of the discontinuous
velocity increase, which averages about 250 km above
the CMB. This is ironic, in that D" was first named for a region
of lower than expected velocities.

Observations of the velocity increase, known as the D" discontinuity,
are usually made with the phases PdP and SdS, each
of which combines waves that reflect off and refract just under

the discontinuity (Fig. 3.5-16). PdP and SdS arrive between the 
direct (P and S) and core-reflected (PcP and ScS) phases, as 
shown. The discontinuity has been observed at many locations 
on the CMB, but other locations, even nearby, do not show 
a PdP or SdS arrival. Moreover, although the average depth 
of the discontinuity is 250 km above the CMB, the observed 
depths range from 100 to 450 km above the CMB. 

One possible explanation for this variability is that the discontinuity
has large topographic variations over small spatial
wavelengths that focus and defocus waves. Another possibility
Fig. 3.5-14 Comparison of model GCA, derived from the data in Fig. 3.5-13,
to P-wave velocity models for other tectonic regions. (Walck, 1984.)

Fig. 3.5-15 Velocity structures from several studies showing an increase
in velocity about 250 km above the core-mantle boundary, known as the
D" discontinuity. (Wysession et al., 1998. The Core-Mantle Boundary
Region, 215-91, copyright by the American Geophysical Union.)

Fig. 3.5-16 Top: Schematic ray paths of the two arrivals making up
the phase SdS. Sbc reflects off the D" discontinuity, and Scd refracts
just below it. (Wysession et al., 1998. The Core-Mantle Boundary
Region, 273-97, copyright by the American Geophysical Union.)
Bottom: Observed Scd arrivals (arrows) (left) compared to synthetic
seismograms (right) computed using velocity model SYLO (Fig. 3.5-15).
(After Young and Lay, 1990. J. Geophys. Res., 95, 17,385-402,
copyright by the American Geophysical Union.)


is that there is no actual discontinuity, but that complex three- 
dimensional velocity heterogeneities give the appearance of a 
discontinuity. This possibility is supported by observations of 
the increased scattering of seismic waves passing through D". 
In either case, it is possible that the increase in velocity is associated
with subducted lithosphere that sank to the bottom of the
mantle. There is a correlation between regions of fast velocities 
in D" and the projected locations of fossil slabs from ancient 
subduction zones (Fig. 3.5-17), which should retain a cold 
thermal anomaly for a long time after reaching the CMB 
(Section 5.4.1). Seismic modeling suggests that this mechanism 
could generate PdP and SdS phases. 

D" shows additional complexities. There is strong evidence 
for significant seismic anisotropy (Section 3.6.6). Large lateral 
Fig. 3.5-17 P-velocity variations at the
base of the mantle. Dark areas represent
anomalously fast velocities, and light areas
are slow anomalies. The fast anomalies
correlate with the predicted locations
of lithosphere subducted during the
Mesozoic that sank to the base of the
mantle. (Wysession, 1996b. Reproduced
with permission from Nature.)



Fig. 3.5-18 Ray paths of SPdKS, a phase that is highly sensitive to the
ultra-low-velocity zone at the base of the mantle. As with many studies
of the deep mantle and core, it is analyzed using the difference between
its travel time and another phase, in this case SKS.


variations at both small and large spatial wavelengths occur for
velocities within D" and for topography on the CMB. There is
also evidence for an ultra-low-velocity zone (ULVZ) at the very
bottom 10-20 km of the mantle. The ULVZ is observed with
an unusual body wave phase, SPdKS, which is similar to SKS
but travels partly as a diffracted P wave at either or both of
its entrance and exit points from the core. SPdKS appears as
a shoulder of the SKS arrival, and is very sensitive to the
P-wave velocity structure just above the CMB (Fig. 3.5-18).
Modeling of SPdKS waveforms suggests that vp may be 10%
lower than in the rest of D", and the reflection coefficients of
PcP precursors that reflect off the top of the ULVZ suggest that
vs may decrease by 30%. The ULVZ may result from partial
melt, because it is most prominent where D" velocities are
lowest, implying that the high temperatures causing the low
velocities may also cause more partial melting.

In summary, much uncertainty remains about the detailed
structure of D" and its causes. This is hardly surprising, because


the CMB is likely to be the site of many processes involving
lateral and vertical motions and vigorous chemical reactions.
An analogy might be that D" is a thermal boundary layer between
the mantle and the core, analogous to the lithosphere, which
is the thermal boundary layer at the top of the mantle. The
high-velocity layer at the base of D" may be a chemical layer,
analogous to the crust. These complexities have led the CMB
to be called the graveyard of ancient ocean lithosphere, the
birthplace of mantle plumes, and the region that most significantly
controls the outer core convection patterns and thus the
earth’s magnetic field. The fact that we study this region largely
via seismic “remote sensing” through 2890 km of heterogeneous
mantle may limit the degree to which it can be understood. 9

3.5.5 Visualizing body waves 

To end our discussion of body waves, it is worth considering 
their physical nature. We have treated body wave arrivals like S 
and ScS as geometric rays. However, although it is convenient 
to describe these waves as rays and to show their paths through 
ray tracing, this approximation does not fully describe their 
behavior. 

To see this, we consider a numerical simulation showing 
time snapshots of the SH shear wave field generated by a 
600 km-deep earthquake (Fig. 3.5-19). The wave field is 
synthesized by summing 28,000 torsional normal modes 
(Section 2.8) with periods above 12 s. The calculations show 
accurate relative amplitudes, with light and dark shades representing
displacements into and out of the paper, respectively.
Although the normal mode solution is itself an approximation 
to the actual wave field in the laterally heterogeneous earth, it 
is much closer to reality than geometric rays. 
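
The idea that summing standing-wave modes yields propagating wave fronts can be seen in a much simpler system. The sketch below is a toy analogue only: it sums the modes of a uniform string with fixed ends, not the earth's torsional modes, and every parameter in it is an arbitrary assumption. An impulsive displacement at one point splits into pulses that travel away from it at the wave speed.

```python
import math

LENGTH = 1.0    # string length (arbitrary units)
SPEED = 1.0     # wave speed
SOURCE_X = 0.3  # position of the impulsive initial displacement
N_MODES = 200   # number of modes summed


def displacement(x, t):
    """Sum of string modes excited by a point displacement at SOURCE_X at t = 0."""
    total = 0.0
    for n in range(1, N_MODES + 1):
        k = n * math.pi / LENGTH  # wavenumber of mode n
        # Each mode is a standing wave oscillating at angular frequency k * SPEED.
        total += math.sin(k * SOURCE_X) * math.sin(k * x) * math.cos(k * SPEED * t)
    return 2.0 * total / LENGTH


def peak_position(t, x_min, x_max, n_points=141):
    """Grid position of the largest displacement between x_min and x_max."""
    xs = [x_min + (x_max - x_min) * i / (n_points - 1) for i in range(n_points)]
    return max(xs, key=lambda x: displacement(x, t))


for t in (0.1, 0.2, 0.3):
    x_peak = peak_position(t, SOURCE_X, LENGTH)
    print(f"t = {t:.1f}: right-going pulse near x = {x_peak:.3f}")
```

No individual mode propagates, yet the sum produces a pulse whose position advances by SPEED × t. The synthetic wave field in Fig. 3.5-19 arises in the same way, with far more modes and a realistic earth model.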

9 The geophysical significance of the CMB and the large uncertainties remaining 
about it are summarized by D. Stevenson’s description of D" as “the sum of all of our 
ignorance of the interior of the earth.” 






Fig. 3.5-19 Snapshots of a synthetic SH wave field showing the propagation of waves after a 600 km-deep earthquake. The initial wave front moves 
away from the source at the lower left side of the figures. The wave front develops complexity due to interactions with the surface, CMB, and internal 
discontinuities and velocity gradients. The wave field is computed using the spherically symmetric PREM velocity model. Amplitudes are raised to 
a power of 0.8 to enhance smaller signals. (After Wysession and Shore, 1994.) 
As shown, a single spherical S wave front is quickly broken
into various wave fronts by reflections off the surface, mantle
discontinuities, and the core. As the wave fronts arrive at the
surface, they cause arrivals that we call S, ScS, sS, etc.

In the first frame, Fig. 3.5-19a, which is 60 s after the earthquake,
the wave front maintains much of its initially spherical
shape. The upgoing part of the wave front is headed toward the 
surface, but will not reach it for another 67 s. The downgoing 
part of the wave front is headed toward the core, where it will 
be fully reflected and give rise to ScS. 

In Fig. 3.5-19b, 300 s after the earthquake, the wave front
still maintains its integrity, though the upper part is now reflecting
off the surface, and the lower part is about to reach the
core. The slower upper mantle velocities cause bends in both
the reflected and the unreflected waves. The S wave front is
reaching the surface 12.5° away from the source, and at closer
distances has already reflected downward. When these downgoing
waves reach the surface again, they will be called the
sS and sScS phases. The downgoing portion of the initial wave
front that will become ScS has not yet reached the core.

By 600 s after the earthquake (Fig. 3.5-19c) added complexity
is evident. The upgoing wave that reflected at the core will
generate ScS and its multiples (ScS2, ScS3, etc.). The surface-reflected
wave is separating into two parts. One is heading into
the lower mantle and will eventually reach the surface as the
sScS and sS phases. The other will turn higher up in the mantle
and arrive at the surface as the SS phase. Behind the sS, ScS,
and sScS wave fronts are upper mantle echoes reflected from
the 220, 400, and 670 km discontinuities. However, despite all
that is going on, the only phase yet recorded at the surface is S,
now arriving 31° away from the source. sS will begin to arrive
in another 63 s, at a distance of 24°.

By 900 s after the origin time (Fig. 3.5-19d), four segments of
the broken wave front are reaching the surface: S at 52°, sS at
9°, SS at 38°, and ScS at 33°. The sS and SS wave fronts have
begun to separate. In contrast, the ScS and S wave fronts have
begun to come back together because they enter the core
shadow, where the S/ScS wave front continues as a diffracted
Sdiff wave. Behind the S and ScS waves in the lower mantle are
the sS and sScS waves, which follow similar paths except for
their surface reflections. The distance between S and sS (and
also between ScS and sScS) is a function of the depth of the
earthquake. The three wave front segments labeled SS form
a characteristic “Y” shape that results from the waves turning
in the mid-mantle. The “Y”’s junction represents the
superposition of the part of the wave front that is heading
down toward the bottoming point and the part of the wave
front that has already turned and is heading back up again.
Behind SS, the phase SSS, which bounces twice on the underside
of the surface, is beginning to form.

In Fig. 3.5-19e, 1200 s after the earthquake, most of the
initial S wave front is actually Sdiff, because S grazes the core at
about 100°. The surface-reflected sS wave now also diffracts
around the core as sSdiff. SSS is now fully developed, and is
reaching the surface behind SS. The polarity of SSS is different


from that of SS, because each successive surface bounce
changes its phase by π/2 (Section 3.5.1). The initial S-wave
polarity is into the page (light-colored), whereas SSS is primarily
out of the page (dark-colored) because it has been
phase-shifted twice. The smaller-amplitude phases evident are
reflections from the upper mantle discontinuities in the velocity
model at depths of 220, 400, and 670 km, and so come in
threes. One set of these, labeled as ScS220S, ScS400S, and
ScS670S, are underside reflections that precede ScS2.
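
The polarity reversal of SSS follows from adding two π/2 phase shifts. The sketch below applies the shifts to a single-frequency cosine, which is a simplification: a real pulse contains many frequencies, and a single π/2 shift changes its shape rather than its sign. The period is an arbitrary illustrative choice, and the sign convention of the shift does not affect the result for two shifts.

```python
import math

PERIOD = 20.0  # s, arbitrary illustrative period


def shifted_wavelet(t, period_s, n_shifts):
    """Cosine of the given period after n_shifts phase shifts of pi/2 each."""
    return math.cos(2.0 * math.pi * t / period_s + n_shifts * math.pi / 2.0)


for t in (0.0, 2.5, 5.0):
    direct = shifted_wavelet(t, PERIOD, 0)  # no surface bounce
    twice = shifted_wavelet(t, PERIOD, 2)   # phase-shifted twice, as for SSS
    print(f"t = {t:4.1f} s: unshifted {direct:+.3f}, shifted twice {twice:+.3f}")
```

Two shifts total π, so the twice-shifted signal is the negative of the unshifted one at every time, matching the opposite shading of S and SSS in the figure.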

By 1500 s after the earthquake (Fig. 3.5-19f), the initial wave
front is entirely diffracted Sdiff, reaching the surface at a distance
of 111°. Because waves travel much faster at the base of
the mantle than in the upper mantle, Sdiff at the CMB has gone
further, reaching 152°. A set of mid-mantle reflections labeled
S220S, S400S, and S670S, which are also visible in the previous
panel, appear ahead of SS. These peel off the upgoing S/Sdiff
wave front as it interacts with the discontinuities. Because they
are related to SS, they also have the “Y”-shape characteristic
of underside-reflected phases. The upgoing parts of the “Y”
formed from the upgoing S phase, but the downgoing parts
(right side of the “Y”) peel off Sdiff and are better called
Sdiff220Sdiff, Sdiff400Sdiff, and Sdiff670Sdiff. The waves with the
largest amplitudes, SS and SSS, are arriving at the surface at
distances of 76° and 63°.

In Fig. 3.5-19g, 1800 s after the earthquake, S4 has begun
to be observed at the surface (71°), following SS (97°) and
SSS (83°). The next surface reflection, S5, is now developing.
The ScS2 multiple reflection is arriving at the surface 36° from
the earthquake. The downgoing part of SS is from Sdiff reflecting
at the surface, so it will arrive at the surface at distances
greater than 200° as the phase SdiffSdiff. By now, 30 minutes
after the earthquake, seismic energy has spread throughout
the mantle. Multiple ScS waves are still reverberating between
the surface and the core. At the CMB, the leading Sdiff wave
has wrapped around the antipode and is heading back toward
the epicenter.

This simulation illustrates that although the ray paths used 
to describe body waves in the earth are intuitively appealing 
and useful, they are simple ways of characterizing a complicated
wave field. An earthquake generates an initially spherical
wave front whose interaction with various interfaces gives rise 
to many wave fronts. We use names for the arrivals that the 
wave fronts cause at the surface, so different parts of the same 
wave front, or the same part at different times, are given different
names. Hence our intuition based on geometric rays can
lead us to miss some of the richness that occurs. For example, 
we tend to view diffraction as an exotic effect different from the 
direct ray path, but the simulation shows no major change as 
the direct wave becomes the diffracted wave, although there 
is a loss of high frequencies. Hence the simulation shows no 
obvious core shadow zone, because seismic energy reaches 
the shadow zone by diffraction and multiple reflections. The 
essential point is that the wave fields are the physical entities, 
whereas rays are useful approximations whose limitations 
should be kept in mind.

3.6 Anisotropic earth structure

3.6.1 General considerations 

So far in this chapter, we have considered a view of the earth 
developed from analyses of seismic waves assuming that they 
propagated through an earth made up of purely isotropic, 
linearly elastic material (Section 2.3.9). In such material, the 
stresses are linearly proportional to the strains via Hooke’s law

$$\sigma_{ij} = c_{ijkl} e_{kl}, \qquad (1)$$

and the 81-term tensor of elastic moduli, $c_{ijkl}$, reduces to two
independent elastic constants, $\lambda$ and $\mu$. As a result, the material’s
elastic properties are the same in all directions. Although isotropy
is a good first approximation in the earth, it is sometimes
important to consider deviations from isotropy, or anisotropy.
In such cases, Hooke’s law applies, but the relation between the 
stresses and strains involves more than two elastic constants. 
Although there can be up to 21 independent elastic constants, 
any material in which more than two are needed is called 
anisotropic. 

Having more than two elastic constants means that the 
material’s properties differ depending on the direction. Because 
seismic wave velocities depend on the elastic constants, waves 
traveling through anisotropic material travel faster or slower 
depending on their direction, and complicated wave phenomena
can occur. For example, a shear wave can be split into two
pulses, each with a different polarity and traveling at a different
speed (Figs 3.6-1, 2.4-8).

Anisotropy can result from a material’s being non-uniform, 
a condition called heterogeneity or inhomogeneity. A common 
situation is when material has directionality in its structure. For 
instance, plywood is a superposition of thin layers of wood, 
so its strength (shear modulus) differs in different directions. 



Fig. 3.6-1 Schematic of an initially polarized shear wave split along the 
fast and slow anisotropic directions, yielding pulses separated in time. 
The pulses remain split after leaving the anisotropic region. 


Similarly, a stack of rock layers with different isotropic velocities
can as a whole behave anisotropically, so seismic waves
travel with different speeds parallel or perpendicular to the 
layers. This situation is called shape-preferred orientation 
(SPO) anisotropy. Anisotropy can also occur for homogeneous 
materials. For example, the crystal structure of the mineral 
olivine is homogeneous in that it is composed of the same 
repeating groups of atoms, but acts anisotropically because its 
acoustic properties vary in different directions relative to the 
crystal lattice. This situation is called lattice-preferred orientation
(LPO) anisotropy.

The anisotropic variations of the seismic velocity of earth 
materials are small compared to the large changes in seismic 
velocity that occur radially from the surface to the core. Hence, 
in developing radial models of seismic velocity, anisotropy has 
traditionally been treated as a secondary effect. Nevertheless,
recent efforts to better quantify three-dimensional velocity variations
sometimes find that anisotropic perturbations are comparable
to lateral velocity changes. It is often difficult, however,
to distinguish between the effects of anisotropy and those of 
heterogeneity. For example, curvature on a refracting interface 
can simulate many of the effects associated with anisotropy. 

An important reason to study anisotropy is that material flow 
at depth appears to preferentially orient olivine crystals within 
upper mantle rocks. Hence mapping the seismically “fast” 
direction lets us investigate the relation between plate motions 
and mantle flow at depth. Although anisotropy studies are 
ongoing, and both results and interpretations will change over 
time, they represent a major frontier in deep earth studies. 


3.6.2 Transverse isotropy and azimuthal anisotropy 

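As discussed in Section 2.3.9, the symmetry of the stress and
strain tensors and the idea of strain energy mean that no more
than 21 of the 81 elastic constants $c_{ijkl}$ are independent. We can
thus write the $c_{ijkl}$ tensor as a symmetric matrix $C_{mn}$, where the
indices m and n vary from 1 to 6 as the pairs of indices (i, j) or (k, l)
take values of (1, 1), (2, 2), (3, 3), (2, 3), (1, 3) and (1, 2),
respectively:

$$
C_{mn} = \begin{pmatrix}
C_{11} & C_{12} & C_{13} & C_{14} & C_{15} & C_{16} \\
C_{12} & C_{22} & C_{23} & C_{24} & C_{25} & C_{26} \\
C_{13} & C_{23} & C_{33} & C_{34} & C_{35} & C_{36} \\
C_{14} & C_{24} & C_{34} & C_{44} & C_{45} & C_{46} \\
C_{15} & C_{25} & C_{35} & C_{45} & C_{55} & C_{56} \\
C_{16} & C_{26} & C_{36} & C_{46} & C_{56} & C_{66}
\end{pmatrix}. \qquad (2)
$$

For an isotropic material, the $c_{ijkl}$ tensor can be written in terms
of two independent elastic constants

$$c_{ijkl} = \lambda \delta_{ij}\delta_{kl} + \mu(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \qquad (3)$$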

so its matrix form is 

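$$
C_{mn} = \begin{pmatrix}
\lambda + 2\mu & \lambda & \lambda & 0 & 0 & 0 \\
\lambda & \lambda + 2\mu & \lambda & 0 & 0 & 0 \\
\lambda & \lambda & \lambda + 2\mu & 0 & 0 & 0 \\
0 & 0 & 0 & \mu & 0 & 0 \\
0 & 0 & 0 & 0 & \mu & 0 \\
0 & 0 & 0 & 0 & 0 & \mu
\end{pmatrix}. \qquad (4)
$$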

However, the crystal structures of many earth materials require
additional independent elastic coefficients. For example,
ice, quartz, olivine, or plagioclase feldspar require 5, 6, 9, and
21 constants, respectively. In such cases, the matrix is more
complicated.
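The reduction from the four-index tensor of Eqn 3 to the 6 × 6 matrix of Eqn 4 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the values of `lam` and `mu` are illustrative assumptions, and the index pairs follow the ordering (1, 1), (2, 2), (3, 3), (2, 3), (1, 3), (1, 2) given above.

```python
import numpy as np

# Index pairs (i, j) for matrix indices 1..6, written 0-based:
# (1,1), (2,2), (3,3), (2,3), (1,3), (1,2)
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def isotropic_cijkl(lam, mu):
    """Build the 3x3x3x3 isotropic tensor of Eqn 3."""
    d = np.eye(3)  # Kronecker delta
    return (lam * np.einsum("ij,kl->ijkl", d, d)
            + mu * (np.einsum("ik,jl->ijkl", d, d)
                    + np.einsum("il,jk->ijkl", d, d)))


def to_voigt(c):
    """Collapse c_ijkl to the 6x6 matrix C_mn."""
    C = np.zeros((6, 6))
    for m, (i, j) in enumerate(VOIGT_PAIRS):
        for n, (k, l) in enumerate(VOIGT_PAIRS):
            C[m, n] = c[i, j, k, l]
    return C


lam, mu = 2.0, 1.0  # illustrative values in arbitrary units (assumed)
C_iso = to_voigt(isotropic_cijkl(lam, mu))
print(C_iso)
```

The printed matrix has lam + 2 mu on the first three diagonal entries, lam in the remaining entries of the upper-left 3 × 3 block, and mu on the last three diagonal entries, which is the pattern of Eqn 4.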

One of the most important forms of anisotropy, known
as transverse isotropy (also known as radial anisotropy,
axisymmetry, and cylindrical symmetry), occurs for a stack of
layered materials. Each layer is isotropic in its properties, but
these properties differ between layers (as in plywood). Thus the
elastic properties, and hence seismic velocities, of the stack as
a whole are identical regardless of the amount of rotation
about the axis of symmetry, which is perpendicular to the layers.
However, the aggregate properties along the symmetry axis
differ from those in the plane of the layers.

A transversely isotropic material can be characterized by five
independent elastic coefficients, A, C, F, L, N, that represent its
aggregate properties. If the axis of symmetry is $x_3$, so properties
in that direction differ from those in the $x_1$-$x_2$ plane, the elastic
constant matrix (Eqn 4) becomes

$$
C_{mn} = \begin{pmatrix}
A & A - 2N & F & 0 & 0 & 0 \\
A - 2N & A & F & 0 & 0 & 0 \\
F & F & C & 0 & 0 & 0 \\
0 & 0 & 0 & L & 0 & 0 \\
0 & 0 & 0 & 0 & L & 0 \\
0 & 0 & 0 & 0 & 0 & N
\end{pmatrix}. \qquad (5)
$$

Comparisons with matrices 2 and 4 show that terms that were
the same for an isotropic material (consider $C_{11}$ and $C_{33}$, or $C_{55}$
and $C_{66}$) now differ, because terms involving the $x_3$ direction
differ from those in the $x_1$ or $x_2$ directions.

This matrix gives the velocities of waves propagating in different
directions. First, consider waves propagating in the $x_1$
direction (Fig. 3.6-2, top). By analogy to the isotropic case,
A corresponds to $\lambda + 2\mu$ for the $x_1$ direction, N corresponds
to $\mu$ for the $x_2$ direction, and L corresponds to $\mu$ for the
$x_3$ direction. Thus the P velocity and the two orthogonal S
velocities are



Fig. 3.6-2 Cartoon showing the effects of transverse isotropy due to
layering. Top: Directions of oscillations for P and S waves propagating in
the $x_1$ direction, in the plane of layering. The shear wave oscillating in the
plane of the layering has velocity $S_1$, which is generally faster than that
for the shear wave oscillating across the layers, $S_2$. Bottom: Directions
of oscillations for P and S waves propagating in the $x_3$ direction,
perpendicular to the layering. The compressional wave velocity, $P_2$, is
generally less than $P_1$. Both shear waves have the same velocity, $S_2$.


$$P_1 = (A/\rho)^{1/2}, \quad S_1 = (N/\rho)^{1/2}, \quad S_2 = (L/\rho)^{1/2}. \qquad (6)$$

Hence the velocity of shear waves traveling in this direction 
depends on the directions of their particle motions. The waves 
become split, with waves polarized in one plane traveling 
faster than those polarized in the other. This is one way to get 
splitting like that shown in Fig. 3.6-1. These results would 
be the same for propagation in the $x_2$ direction, or any other
direction in the $x_1$-$x_2$ plane, because physical properties in this
plane are independent of direction.

In many applications, the horizontally layered earth shows 
transverse isotropy about a vertical axis. The SH-wave velocity
$S_1$ is generally faster than the SV velocity $S_2$, because the SH
displacement is preferentially in the fast layers, whereas SV 
samples both equally. An interesting consequence is that the 
shear velocity inferred from the dispersion of Love waves, 
which are SH waves, would be higher than that from Rayleigh 
waves, which involve SV.

By contrast, for P and S waves propagating in the $x_3$ (axis of
symmetry) direction (Fig. 3.6-2, bottom), both S velocities
equal $S_2$ in Eqn 6, because both polarizations are governed by
L ($C_{44}$ and $C_{55}$ in Eqn 5). The P velocity reflects the fact that C
corresponds to $\lambda + 2\mu$ for the $x_3$ direction, so

$$P_2 = (C/\rho)^{1/2}. \qquad (7)$$

For layered materials, typically $P_1 > P_2$, so P waves propagate
faster in the $x_1$ direction than in the $x_3$ direction. This is because
the wave travels preferentially in the fast layers in the $x_1$ direction,
whereas a P wave traveling in the $x_3$ direction must also
traverse the slow layers.

Transverse isotropy is often characterized by three
parameters:

$$\xi = N/L = (S_1/S_2)^2, \quad \phi = C/A = (P_2/P_1)^2, \quad \eta = F/(A - 2L). \qquad (8)$$

If the material were isotropic, $\xi = \phi = \eta = 1$. For layered structures,
generally $\xi > 1$ and $\phi < 1$.
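The behavior of a layered stack can be illustrated with a short calculation. The sketch below is an assumption-laden illustration rather than the authors' method: it uses the standard long-wavelength Backus (1962) averages, which are not derived in this text, to turn a stack of thin isotropic layers into the five constants A, C, F, L, N, and then evaluates the velocities of Eqns 6 and 7 and the parameters of Eqn 8. The layer moduli, thicknesses, and density are hypothetical values.

```python
import numpy as np


def backus_average(lam, mu, thickness):
    """Effective A, C, F, L, N for thin isotropic layers (Backus averaging)."""
    lam = np.asarray(lam, dtype=float)
    mu = np.asarray(mu, dtype=float)
    w = np.asarray(thickness, dtype=float)
    w = w / w.sum()  # thickness weights

    def avg(x):
        return float(np.sum(w * x))

    m = lam + 2.0 * mu  # P-wave modulus of each layer
    C = 1.0 / avg(1.0 / m)
    F = avg(lam / m) * C
    A = avg(4.0 * mu * (lam + mu) / m) + avg(lam / m) ** 2 * C
    L = 1.0 / avg(1.0 / mu)  # harmonic mean of the shear moduli
    N = avg(mu)              # arithmetic mean of the shear moduli
    return A, C, F, L, N


def ti_summary(A, C, F, L, N, rho):
    """Velocities of Eqns 6-7 and the parameters of Eqn 8."""
    return {
        "P1": np.sqrt(A / rho), "P2": np.sqrt(C / rho),
        "S1": np.sqrt(N / rho), "S2": np.sqrt(L / rho),
        "xi": N / L, "phi": C / A, "eta": F / (A - 2.0 * L),
    }


# Hypothetical two-layer stack: moduli in Pa, equal thicknesses, density in kg/m^3
layered = ti_summary(*backus_average([30e9, 60e9], [30e9, 60e9], [1.0, 1.0]), rho=2700.0)
uniform = ti_summary(*backus_average([50e9], [30e9], [1.0]), rho=2700.0)
print(layered)
print(uniform)
```

Because N is an arithmetic mean and L a harmonic mean of the same shear moduli, N is never smaller than L, so the layered stack gives $\xi \geq 1$, whereas the single uniform layer returns $\xi = \phi = \eta = 1$.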

A second common type of anisotropy is azimuthal
anisotropy, in which velocities vary as a function of horizontal
direction. One way to obtain this is to have transverse isotropy
with the $x_3$ axis turned to horizontal, which is analogous to
standing plywood vertically. In general, the P-wave velocity
varies with azimuth $\theta$ as

$$P(\theta) = A_1 + A_2 \cos 2\theta + A_3 \sin 2\theta + A_4 \cos 4\theta + A_5 \sin 4\theta, \qquad (9)$$

where the constants $A_i$ depend on the 21 elastic constants.
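In practice the coefficients of Eqn 9 are estimated from velocities measured at many azimuths. The following sketch shows one way this could be done with ordinary least squares; the function names, the sampling, and the "true" coefficients used to make synthetic data are all assumptions for illustration.

```python
import numpy as np


def azimuthal_design(theta):
    """Design matrix whose columns are the five terms of Eqn 9."""
    theta = np.asarray(theta, dtype=float)
    return np.column_stack([
        np.ones_like(theta),
        np.cos(2 * theta), np.sin(2 * theta),
        np.cos(4 * theta), np.sin(4 * theta),
    ])


def fit_azimuthal(theta, velocity):
    """Least-squares estimate of A_1..A_5 from velocities at azimuths theta."""
    coeffs, *_ = np.linalg.lstsq(azimuthal_design(theta),
                                 np.asarray(velocity, dtype=float), rcond=None)
    return coeffs


# Synthetic, noise-free data (km/s) from assumed coefficients
theta = np.deg2rad(np.arange(0, 360, 15))
true_coeffs = np.array([8.15, 0.20, 0.05, 0.03, -0.01])
velocity = azimuthal_design(theta) @ true_coeffs

est = fit_azimuthal(theta, velocity)
# Azimuth at which the 2-theta term is largest (the "fast" direction)
fast_deg = np.rad2deg(0.5 * np.arctan2(est[2], est[1]))
print(est, fast_deg)
```

With real data the fit would include noise and uneven azimuthal coverage, so the recovered coefficients would carry uncertainties that this sketch does not estimate.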


3.6.3 Anisotropy of minerals and rocks 

An important source of seismic anisotropy is minerals that are 
anisotropic due to their crystal structure. At microscopic levels 
the anisotropy can be enormous, with velocities along different 
mineralogical axes varying by more than 100%. Generally,
however, the anisotropic mineral grains are randomly oriented,
and seismic wavelengths are long enough to average over many
grains, leaving only weak anisotropy. However,
in some cases the mineral grains are aligned, causing significant
anisotropy.

Laboratory studies of the elastic moduli of minerals give 
insight into such LPO anisotropy. Some studies involve static 
methods like twisting or squeezing samples, but most use the 
vibrational properties of mineral samples as small as 1 mm. At 
very high pressures, a technique called Brillouin scattering, 
which measures how laser light passing through the mineral is 
distorted, yields elastic constants for samples smaller than 
0.1 mm. 

Fig. 3.6-3 P and S velocities (km/s) in different directions relative to
the crystal structure of olivine. P velocities are in the directions of the
dashed lines, and the S velocities are shown by the adjacent pairs of
perpendicular lines. The a axis, corresponding to the [100] crystal face,
is the fastest direction through the crystal. It is also the dominant slip
direction, so olivine crystals align in the direction of plastic flow.
(Babuska and Cara, 1991. With kind permission from Kluwer
Academic Publishers.)

One of the most important anisotropic minerals is olivine
(Fig. 3.6-3), which comprises much of the upper mantle
(Section 3.8). For waves propagating in the fastest direction,
the P-wave velocity is 9.89 km/s and the S velocities are
4.89 km/s and 4.87 km/s. By contrast, the slowest P velocity in
this example is 7.72 km/s. The magnitude of anisotropy is
characterized by

$$k = (a_{max} - a_{min}) / a_{mean}. \qquad (10)$$

For P waves in the olivine crystal, $a_{max}$ = 9.89 km/s, $a_{min}$ =
7.72 km/s, and $a_{mean}$ = 8.81 km/s, so k = 25%. The maximum
and minimum S velocities are 5.53 km/s and 4.42 km/s, so
k = 22%. Although for olivine the anisotropy of P and S waves
is similar, they can differ greatly for other minerals.
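The arithmetic behind the olivine values quoted with Eqn 10 can be reproduced in a few lines. This is only a sketch: the function name is hypothetical, and because the text gives no mean S velocity, the code assumes the mean is the average of the maximum and minimum values.

```python
def anisotropy_percent(v_max, v_min, v_mean=None):
    """Anisotropy magnitude k of Eqn 10, in percent."""
    if v_mean is None:
        # Assumption: use the average of the extremes when no mean is quoted
        v_mean = 0.5 * (v_max + v_min)
    return 100.0 * (v_max - v_min) / v_mean


k_p = anisotropy_percent(9.89, 7.72, 8.81)  # olivine P velocities (km/s)
k_s = anisotropy_percent(5.53, 4.42)        # olivine S velocities (km/s)
print(round(k_p), round(k_s))
```

Rounded to the nearest percent, the two values agree with the 25% and 22% given above.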

Other important minerals range from nearly isotropic to 
extremely anisotropic. One of the most isotropic minerals 
is garnet, where k for both P and S waves is < 1%. At the other
extreme, sheet silicates like mica can have values of k up to 
60% for P waves and 116% for S waves. 

As a result, a major factor controlling a rock’s anisotropy is 
the anisotropy of the minerals composing it and their relative 
proportions. Another important factor is the presence of 
deviatoric stresses, which can cause a preferred orientation of 
anisotropic mineral grains that might otherwise be randomly 
distributed. Crystals are generally oriented with their smallest 
widths in the direction of maximum compression. For example,
in micas, which are important components of highly foliated
schists, the flat crystals are oriented parallel to the plane of
least compression. Thus slip occurs more easily parallel to the
developing foliation, because the planar mica faces contain
the weakest bonds. Shear in a preferred direction can also
recrystallize different mineral assemblages, so the resulting
anisotropy reflects a combination of the preferred orientation
of anisotropic materials and the presence of laminar structures.

3.6.4 Anisotropy of composite structures

Anisotropy can also result from an asymmetric combination
of materials. The upper continental crust often contains horizontally
layered sedimentary rocks. Similarly, oceanic crust is
composed of sediments overlying layers of basalt and gabbro.
Such layering can yield transverse isotropy, with the symmetry
axis oriented vertically. On a regional scale, plate collisions often
cause significant metamorphism, sometimes yielding transverse
isotropy due to the preferred orientation of the foliation of
gneisses and schists.

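Fluid-filled cracks, for example in a volcanic region, can also
cause anisotropy. For a material containing two-dimensional
fluid-filled cracks whose normals are parallel to the $x_1$ axis, the
anisotropy is given by

$$
C_{mn} = \begin{pmatrix}
\lambda + 2\mu & \lambda & \lambda & 0 & 0 & 0 \\
\lambda & \lambda + 2\mu & \lambda & 0 & 0 & 0 \\
\lambda & \lambda & \lambda + 2\mu & 0 & 0 & 0 \\
0 & 0 & 0 & \mu & 0 & 0 \\
0 & 0 & 0 & 0 & \mu(1 - \varepsilon) & 0 \\
0 & 0 & 0 & 0 & 0 & \mu(1 - \varepsilon)
\end{pmatrix}, \qquad (11)
$$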

where $\varepsilon$ is the crack density given by $\varepsilon = Na^3/V$, N is the number
of cracks in the volume V, and a is the half-width of a crack. If
the cracks become infinitely small, $\varepsilon = 0$, yielding the isotropic
case (Eqn 4). In general, the anisotropy depends on the geometry
of the inclusions and their contrast in properties with the
surrounding matrix. For computational ease, rods (prolate
spheroids) and disks (oblate spheroids) are often assumed in
seismic modeling.
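The effect of the crack density on the elastic matrix of Eqn 11 can be made concrete with a small numerical example. This is a sketch under stated assumptions: the function names are hypothetical, and the crack count, half-width, volume, and moduli are illustrative values chosen to give a crack density of 0.1.

```python
import numpy as np


def crack_density(n_cracks, half_width, volume):
    """Crack density: number of cracks times half-width cubed, per unit volume."""
    return n_cracks * half_width ** 3 / volume


def cracked_matrix(lam, mu, eps):
    """6x6 matrix of Eqn 11 for fluid-filled cracks with normals along x1."""
    C = np.zeros((6, 6))
    C[:3, :3] = lam
    C[np.arange(3), np.arange(3)] = lam + 2.0 * mu
    C[3, 3] = mu
    C[4, 4] = C[5, 5] = mu * (1.0 - eps)  # shear terms involving x1 are weakened
    return C


# Hypothetical example: 1000 cracks of half-width 1 cm in 0.01 m^3 of rock
eps = crack_density(1000, 0.01, 0.01)
C_crack = cracked_matrix(lam=30e9, mu=30e9, eps=eps)

# Shear waves governed by C_55 or C_66 slow by this factor relative to uncracked rock
velocity_ratio = np.sqrt(C_crack[4, 4] / C_crack[3, 3])
print(eps, velocity_ratio)
```

Setting `eps` to zero in `cracked_matrix` recovers the isotropic pattern of Eqn 4, as the text notes.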






3.6.5 Anisotropy in the lithosphere and the asthenosphere

Anisotropy in the lithosphere takes many forms, including that
of glaciers whose flow aligns the ice crystals. Closer to our
applications, several effects generate anisotropy in the oceanic
crust. Horizontal sediment layers can create transverse isotropy
of up to 15% with a vertical symmetry axis. In the upper crustal
layer of vertical-sheeted basaltic dikes, azimuthal anisotropy is
thought to exist with a horizontal axis perpendicular to the
dikes and thus in the spreading direction.

Sub-crustal oceanic lithosphere shows strong azimuthal
anisotropy. The flow processes associated with plate spreading


Fig. 3.6-4 Top: Illustration of how the spreading process yields a 
preferred orientation of olivine crystals in the oceanic lithosphere, with the 
fast axis of velocity ([100]) in the spreading direction. Bottom: Variations 
in $P_n$ wave velocities near Hawaii. The azimuth is measured relative to the
trend of the isochrons (90° from the spreading direction), so the maxima
at 90° and 270° show that the fast direction of the azimuthal anisotropy is
in the direction of spreading when the plate formed. (Morris et al., 1969.
J. Geophys. Res., 74, 4300-16, copyright by the American Geophysical
Union.)

Fig. 3.6-5 Left: Seismic reflection profile
of the crust and upper mantle in eastern 
Australia. The lower crust has multiple, 
discontinuous, and sub-horizontal reflectors 
possibly due to strain-induced fabrics, 
igneous layering, or free fluids. This 
structure yields vertical-axis transverse 
isotropy. (Finlayson et al., 1989. Properties
and Processes of Earth’s Lower Crust , 1-16, 
by permission of Australian Geological 
Survey Organisation.) Right: Schematic 
cross-section of the crust in the northern Ruby 
Mountains of the North American Basin and 
Range. There is a strong tendency toward 
horizontally layered features, although the 
likely origins of such fabrics vary with depth. 
(Smithson, 1989. Properties and Processes 
of Earth’s Lower Crust , 53-63, permission 
as above.)

(Fig. 3.6-4, top) appear to orient olivine crystals preferentially
in the spreading direction, along their [100] slip axes.¹ Because
P waves propagate fastest in this direction (Fig. 3.6-3), $P_n$ head
waves that sample the uppermost mantle just below the Moho
(Section 3.2.1) show a strong azimuthal velocity dependence
(Fig. 3.6-4, bottom). This variation is approximately described
by the $\cos 2\theta$ term in Eqn 9, where $\theta$ is measured from the
spreading direction, so the velocity is highest in the spreading
direction or 180° from it. This anisotropy is “frozen in” as the
lithosphere ages, and so records the spreading direction.

Because continental crust is more complicated than oceanic 
crust, so is its anisotropy. A primary source of anisotropy in 
the upper crust is the presence of cracks, often fluid-filled. 
Such cracks often have a near-vertical orientation induced by 
regional stress fields parallel to the cracks. When these cracks 
occur in horizontal sediments that would by themselves have 
vertical-axis transverse isotropy, the combined result can be 
orthorhombic symmetry. The lower continental crust tends to 
have strong sub-horizontal layering, perhaps resulting from 
ductile deformation, which causes seismic anisotropy. Figure
3.6-5 shows such layering in a seismic reflection profile and
a schematic diagram. 

Anisotropy within and beneath continental lithosphere is 
often studied with a technique called shear wave splitting. When 
SKS waves convert from P waves in the outer core to S waves in
the lower mantle, they are entirely polarized in the radial (SV)
direction, because all the initial SH energy was reflected when
the downgoing S wave encountered the core-mantle boundary.
As these shear waves travel across the mantle and crust, however,
they can be split when traveling through anisotropic media
(Fig. 3.6-6). Assuming transverse isotropy with a horizontal
axis of symmetry, the two polarized waves travel at different

1 This representation of crystallographic axes is discussed in mineralogy texts like 
Klein and Hurlbut (1985).

Fig. 3.6-6 Splitting of an incoming shear wave into pulses oriented along
the fast ($s_1$) and slow ($s_2$) directions of anisotropy. The polarization
angle $\phi$ gives the rotation of the fast axis relative to the radial propagation
direction, and $\delta t$ is the time difference between the split pulses.


speeds and arrive at different times. Thus, if the SKS signal on
the radial component in an isotropic earth is s(t), its projection
into the fast and slow polarizations is, respectively,

$$s_1(t) = s(t) \cos\phi, \quad s_2(t) = s(t - \delta t) \sin\phi, \qquad (12)$$

where $\phi$ is the polarization angle between the radial direction
and the fast axis, and $\delta t$ is the delay time between the fast
and slow polarizations. We would normally not expect any
SKS on the transverse component, but anisotropy yields a combination
of both the fast and the slow polarizations on both the
radial and the transverse components, given by

$$R(t) = s(t) \cos^2\phi + s(t - \delta t) \sin^2\phi,$$

$$T(t) = [(s(t) - s(t - \delta t))/2] \sin 2\phi. \qquad (13)$$

For example, in Fig. 3.6-7a (top), SKS appears on the transverse
component. The two components are rotated to yield the
fast and slow polarizations, $s_1(t)$ and $s_2(t)$ (Fig. 3.6-7a, middle).
The time shift $\delta t$ is then applied, and the signals are rotated

Fig. 3.6-7 Shear wave splitting of SKS
waves for a Kuril Islands earthquake,
stacked across an array of seismometers in
New Zealand. a: SKS waveforms before
and after processing. Top: radial and
transverse components before processing.
Note the large SKS signal on the transverse
component, which should not be there
for an isotropic earth. Middle: SKS
waveforms after rotation into the fast and
slow polarizations. Bottom: SKS waveform
after the splitting has been removed so that
all SKS is on the radial component. b:
Particle motion plots (Section 2.4) of SKS
on the radial and transverse components
before and after removal of the transverse
signal. c: Contour plot of the amplitude in
the radial component as a function of the
delay time and polarization angle. The
minimum corresponds to the best-fitting
value. (Gledhill and Gubbins, 1996. Phys.
Earth Planet. Inter., 95, 227-36, with
permission from Elsevier Science.)


again so that all of the signal appears on the radial component
(Fig. 3.6-7a, bottom). As shown in Fig. 3.6-7b, before correction,
particle motion occurs on both components, but after
correction, the motion is limited to the radial component. The
fact that this technique removes the transverse signal shows
the appropriateness of the transversely isotropic model. The
values of $\phi$ and $\delta t$ are found by minimizing the transverse
signal, as shown by the contour plot in Fig. 3.6-7c. Typical
values for the magnitude of shear wave splitting, $\delta t$, are in the
0-2 s range.
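The measurement procedure just described (rotate to trial fast and slow directions, undo a trial delay, rotate back, and keep the pair of values that leaves the least transverse energy) can be sketched on synthetic data built from Eqns 12 and 13. This is an illustrative sketch, not the processing used for Fig. 3.6-7: the pulse shape, sampling interval, search grids, and function names are assumptions, and delays are restricted to whole samples.

```python
import numpy as np


def split_rt(s, phi, lag):
    """Forward model of Eqns 12-13: radial and transverse traces."""
    fast = s * np.cos(phi)
    slow = np.roll(s, lag) * np.sin(phi)  # s(t - dt), delayed by lag samples
    radial = fast * np.cos(phi) + slow * np.sin(phi)
    transverse = fast * np.sin(phi) - slow * np.cos(phi)
    return radial, transverse


def transverse_energy(radial, transverse, phi, lag):
    """Undo a trial (phi, lag) and return the transverse energy that remains."""
    fast = radial * np.cos(phi) + transverse * np.sin(phi)
    slow = radial * np.sin(phi) - transverse * np.cos(phi)
    slow = np.roll(slow, -lag)  # advance the slow pulse by the trial delay
    corrected_t = fast * np.sin(phi) - slow * np.cos(phi)
    return float(np.sum(corrected_t ** 2))


def grid_search(radial, transverse, phis, lags):
    """Return the (phi, lag) pair that minimizes the transverse energy."""
    energy = np.array([[transverse_energy(radial, transverse, p, k)
                        for k in lags] for p in phis])
    i, j = np.unravel_index(np.argmin(energy), energy.shape)
    return phis[i], lags[j], energy


dt = 0.05                                   # sample interval in s (assumed)
t = np.arange(800) * dt
s = -(t - 20.0) * np.exp(-0.5 * (t - 20.0) ** 2)  # synthetic SKS pulse (assumed)

true_phi, true_lag = np.deg2rad(30.0), 20   # 30 degrees and 1.0 s of splitting
R, T = split_rt(s, true_phi, true_lag)

phis = np.deg2rad(np.arange(0, 180, 5))
lags = np.arange(0, 41)                     # trial delays of 0 to 2 s
best_phi, best_lag, energy = grid_search(R, T, phis, lags)
print(np.rad2deg(best_phi), best_lag * dt)
```

The `energy` array plays the role of the contoured surface in Fig. 3.6-7c. Real data add noise, so the minimum is broader and its width is usually used to assign uncertainties, which this sketch does not attempt.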

Seismic anisotropy within continents is thought to reflect
crystal alignment created during a tectonic episode and then
“frozen in.” The anisotropy is a result of the last episode
of tectonism, which resets any previous anisotropy. Because
continental rock can be as old as 4 Ga (the mean age is about
1.5 Ga), anisotropy in continental lithosphere can reveal
information about very old tectonic events such as episodes of
mountain building. For plate collisions the fast axis is usually
sub-perpendicular to the principal stress axis, or parallel to
the resulting orogenic belts. There may also be deeper anisotropy
due to oriented olivine in the flowing asthenosphere.
However, it is sometimes difficult to distinguish this effect
from lithospheric anisotropy. For instance, in eastern North
America the fast axis is oriented WSW-ENE, parallel to the
direction of both absolute plate motion (Section 5.2.4) (and thus
presumably asthenospheric flow) and major orogenic boundaries
like the Appalachian Mountains (Fig. 3.6-8).

Surface wave observations indicate that anisotropy extends
to a depth of about 300 km beneath oceans. The S-wave velocity
inferred from Love waves, which are SH waves, is higher
than inferred from Rayleigh waves, which involve SV. Figure
3.6-9 shows the squared S-wave velocity ratio $\xi$ (Eqn 8) versus
depth for several ages of oceanic lithosphere. The deviation of $\xi$
from 1 reflects transverse isotropy with SH velocities faster
than SV velocities. Because the oceanic lithosphere extends to a
depth of about 100-125 km, anisotropy seems to extend into
the asthenosphere.

In addition, Rayleigh wave velocities show azimuthal 
anisotropy similar to that found for P n waves that sample the 
uppermost mantle at much shallower depths. Both types of 
anisotropy may reflect mantle flow (Fig. 3.6-4). The flow- 
induced preferred orientation of olivine would give azimuthal 
anisotropy in the spreading direction. Taking paths in different 
directions averages out the azimuthal effect, leaving a net 
transverse isotropy that is symmetric about the vertical. An 
interesting consequence of this model is that near the ridges, 
where mantle material is upwelling, transverse isotropy should 
be less significant, as the data show. At older ages, mantle flow 
will be more horizontal, increasing transverse isotropy. 

3.6.6 Anisotropy in the mantle and the core 

Although most of the mantle shows little or no anisotropy,
this is not so for the D" region at the base of the mantle, where
complex interactions with the liquid outer core may occur
(Section 3.5.4). Studying anisotropy in a narrow layer nearly
3000 km below the heterogeneous mantle and crust is challenging,
but initial investigations suggest anisotropy on the order of
several percent, comparable to the isotropic velocity variations.
D" anisotropy seems to fall into two categories. Beneath regions
of paleo-subduction, such as western Central America and the
northern Pacific rim, SH waves in the form of S, ScS, or Sdiff
travel faster than their SV counterparts (Fig. 3.6-10). This
behavior has been modeled as transverse isotropy. However,








Fig. 3.6-8 Map of the eastern USA showing 
shear wave splitting results from SKS and 
SKKS. Lines point in the direction of the 
fast axis, assuming horizontally oriented 
transverse isotropy, and the sizes of the 
circles represent the magnitude of the 
splitting in seconds. The background is a 
map of the shear wave velocity anomalies 
at 200 km depth (van der Lee and Nolet,
1997. J. Geophys. Res., 102, 22,815-38,
copyright by the American Geophysical
Union.) The splitting direction is
approximately parallel to the Appalachian
orogenic belts (dashed line) and aligned with
the absolute plate motion (APM). Note the
regional variations for different locations.
(Fouch et al., 2000. J. Geophys. Res., 105,
6255-76, copyright by the American
Geophysical Union.)

Fig. 3.6-10 Evidence for anisotropy at the base of the mantle, shown by
diffracted arrivals for a South American earthquake recorded at Canadian
station DAWY. Arrows show estimates of the onset times. Diffracted SH
arrives before SV, suggesting transverse isotropy. (Kendall and Silver,
1996. Reproduced with permission from Nature.)


Fig. 3.6-9 Depth variations of the square of the $V_{SH}/V_{SV}$ ratio, $\xi$, beneath
the Pacific Ocean. $\xi$ tends to exceed 1, meaning that SH is faster than SV,
consistent with olivine in both the lithosphere and the asthenosphere being
preferentially oriented by the spreading process. (Nishimura and Forsyth,
1989.)

Fig. 3.6-11 Predicted anisotropic behavior for perovskite, periclase, and
silica as a function of pressure in the mantle. The far right corresponds to 
the lowermost mantle, where these phases are major components. The 
kinks in the silica curve result from phase transitions. (Stixrude, 1998. 
The Core-Mantle Boundary Region, 83-96, copyright by the American 
Geophysical Union.) 


D" anisotropy beneath the mid-Pacific is variable, with SH 
waves usually but not always arriving before the accompanying
SV waves. This effect may reflect vertical structures due
to lowermost mantle upwelling. In addition, several mineral
phases that are expected here, such as perovskite (MgSiO3),
periclase (MgO), and the columbite phase of silica (SiO2),
should be anisotropic under these conditions (Fig. 3.6-11). Because
little of the core-mantle boundary has been examined for
anisotropy due to the stringent earthquake-station geometries
required, much is yet to be learned.

Significant anisotropy occurs in the solid iron inner core. 
PKIKP waves (PKP-DF) travel ~3 s faster in the inner core 
along the earth’s rotation axis than along the equatorial plane. 
The PKP-DF and PKP-BC phases (Fig. 3.5-7) travel similar 
paths through the mantle, so any travel time difference between 
them is likely to reflect structure in the core. Because of the 
low viscosity of the liquid outer core, flow should eliminate any 
lateral velocity variations, including anisotropy. Thus the difference
between the observed differential travel times of the BC
and DF phases and that predicted by a model

$$\delta t_{BC} - \delta t_{DF} = (t_{BC} - t_{DF})_{observed} - (t_{BC} - t_{DF})_{predicted}, \qquad (14)$$

is likely to be a function of inner core structure along the DF
path.
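The sign convention of Eqn 14 is easy to lose track of, so a small sketch may help. The function name and all arrival times below are hypothetical values chosen only to show the bookkeeping; they are not measurements.

```python
def bc_df_residual(t_bc_obs, t_df_obs, t_bc_pred, t_df_pred):
    """Differential residual of Eqn 14, in seconds.

    A positive value means PKP-DF arrived earlier, relative to PKP-BC,
    than the reference model predicts, i.e. a faster inner-core path.
    """
    return (t_bc_obs - t_df_obs) - (t_bc_pred - t_df_pred)


# Hypothetical arrival times in seconds after the origin time
polar = bc_df_residual(1185.0, 1178.0, 1185.5, 1181.5)       # path near the spin axis
equatorial = bc_df_residual(1190.0, 1186.0, 1190.5, 1186.5)  # path near the equator
print(polar, equatorial)
```

Because both phases share nearly the same mantle path, errors common to the two observed times (here a 0.5 s offset in the equatorial example) cancel in the difference, which is the reason for using a differential measurement.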

Figure 3.6-12 shows BC-DF residuals versus $\xi$, the angle
between the PKP-DF ray segment in the inner core and the
earth’s spin axis. Small values of $\xi$ correspond to paths parallel
to the spin axis, and the corresponding large residuals indicate
that near-axial PKP-DF waves travel faster and arrive sooner.
Also shown are theoretical predictions for the anisotropic
behavior of solid iron in the hexagonal close-packed (hcp) and
face-centered-cubic (fcc) structures. The hcp structure of iron,
aligned along the earth’s rotation axis, does a good job of
modeling the observations.

Fig. 3.6-12 PKP-BC − PKP-DF travel time residuals as a function of $\xi$, the
angle between the PKP-DF ray and the earth’s spin axis. Circles and thin
solid line are for data from Song and Helmberger (1993); squares and thin
dashed line are for data from Creager (1992). The thin solid and dashed
lines are the smoothed fits to the residuals. The heavy solid and dashed
lines are the predicted residuals for the transverse isotropy expected if the
inner core were composed of iron in either the hcp or the fcc structure.
The similarity between the hcp curve and the data supports hcp as the
crystal phase for the inner core. (Stixrude and Cohen, 1995. Science, 267,
1972-5, copyright 1995 American Association for the Advancement of
Science.)

Inner core anisotropy is also shown by normal modes that 
have significant displacement in the inner core. If there were no 
lateral heterogeneity or anisotropy, the various singlets making 
up a normal mode multiplet would have almost identical 
eigenfrequencies (Section 2.9). In fact, as shown in Fig. 3.6-13 
for the j5 4 multiplet, the modes are split, so the eigenfre¬ 
quencies for the different singlets (points) vary depending on 
the azimuthal order. The dashed line (left) shows the split¬ 
ting predicted from a transversely isotropic model with elastic 
parameters (shown on the right). Here $\alpha$, $\beta$, and $\gamma$ are combinations
of the elastic constants for transverse isotropy (Eqn 5).
The velocity perturbation for any direction through the inner 
core is 

$$\delta v / v = (2\beta - \gamma) \cos^2 \xi, \qquad (15)$$

where $\xi$ is the angle between the ray path and the earth’s rotation
axis. $\delta v / v$ is zero along an equatorial path, but is about 1%
parallel to the axis.

Inner core anisotropy is not perfectly symmetric about the 
rotation axis, which allows for the possibility of observing differ¬ 
ential rotation of the inner core with respect to the mantle. This 
phenomenon has been reported, seen as temporal variations of 
the BC-DF residuals (Eqn 14) for similar earthquake-station 
geometries. Quantification of such differential rotation and its 
implications for the generation of the magnetic field in the convecting
outer core are active research areas.

Fig. 3.6-13 Evidence for transverse isotropy in the inner core from the splitting of a spheroidal normal mode multiplet. Points represent the observed
frequencies for the different azimuthal orders (left). The dashed curve is the prediction for the inner core of a model shown at the right by combinations of
the elastic constants for transverse isotropy. In this formulation, $\alpha$, $\beta$, and $\gamma$ differ from their usual seismological definitions. (Tromp, 1993. Reproduced
with permission from Nature.)


3.7 Attenuation and anelasticity 

3.7.1 Wave attenuation

In the last section, we extended our view of the earth as an iso¬ 
tropic elastic medium to include the effects of anisotropy. We 
now consider anelasticity, or deviation from elasticity, which
is one of the reasons why seismic waves attenuate or decrease
in amplitude as they propagate. We have already discussed
how the reflection and transmission of seismic waves at discrete
interfaces reduce their amplitudes. Here, we consider four
other processes that can reduce wave amplitudes: geometric
spreading, scattering, multipathing, and anelasticity. The first
three are elastic processes, in which the energy in the propagating
wave field is conserved. By contrast, anelasticity, sometimes
called intrinsic attenuation, involves conversion of seismic
energy to heat.

As in many seismological applications, it is worth first con¬ 
sidering familiar analogous behaviors for light. As you move 
away from a street lamp at night, the light appears dimmer for 
several reasons. The first is geometric spreading: light moves 
outward from the lamp in expanding spherical wave fronts 
(Section 2.4.3). By the conservation of energy, the energy in a 
unit area of the growing wave front decreases as $r^{-2}$, where $r$ is
the radius of the sphere or distance from the lamp. 

Second, the light dims as it is scattered by air molecules, dust, 
and water in the air. As we have discussed, scattering results 
when objects acting as Huygens’ sources scatter energy in all 
directions. This effect is dramatic on a foggy night because the 
scattered light causes a halo around the lamp. 


Third, the light is focused or defocused by changes in the 
refractive properties of the air. 1 This effect is termed multi¬ 
pathing in seismology. Focusing and defocusing can be illus¬ 
trated by looking at the street light through binoculars. 
Looking through binoculars the usual way, the waves are 
focused by the lenses, and the lamp appears closer and brighter. 
Reversing the binoculars makes the lamp appear further away 
and dimmer. 

Fourth, some of the light energy is absorbed by the air and 
converted to heat. This process differs from the other three in 
that light energy is actually lost, not just moved onto a different 
path. 

All four processes are important for seismic waves. The first 
three are described by elastic wave theory, and can increase or 
decrease an arrival’s amplitude by shifting energy within the 
wave field. By contrast, anelasticity reduces wave amplitudes 
only because energy is lost from the elastic waves. So much 
of seismology is built upon the approximation that the earth 
responds elastically during seismic propagation that it is easy 
to forget that the earth is not perfectly elastic. However, 
without anelasticity, seismic waves from every earthquake that 
ever occurred would still be reverberating until the accumulat¬ 
ing reverberations shattered the earth. Elasticity is a good 
approximation for the earth’s response to seismic waves, but 


1 This process causes mirages, where light is refracted differently by hot air just 
above the ground. Similarly the distorted appearance of the setting sun results from 
seeing different parts of it through different levels of the atmosphere which refract 
light differently because of the vertical density gradient. 

Fig. 3.7-1 Regional variations in attenuation seen in seismograms from
an April 14, 1995, earthquake in Texas recorded in Nevada (MNV,
$\Delta = 15^\circ$) and Missouri (MM18, $\Delta = 14^\circ$). The MNV record has less
high-frequency energy because the tectonically active western USA is more
attenuating than the stable mid-continent.

there are many important implications and applications of 
anelasticity. 

Anelasticity results because the kinetic energy of elastic 
wave motion is lost to heat by permanent deformation of the 
medium. The large-scale, or macroscopic, term for this process 
is internal friction. Among the smaller-scale, or microscopic, 
mechanisms that may cause this dissipation are stress-induced 
migration of defects in minerals, frictional sliding on crystal 
grain boundaries, vibration of dislocations, and the flow of 
hydrous fluids or magma through grain boundaries. Theoret¬ 
ical and experimental work is being carried out to examine 
possible mechanisms of seismic attenuation. 

The study of anelasticity has lagged behind that of the 
elastic wave velocities because of the complexities involved in 
measuring attenuation and understanding its physical causes. 
Although measuring seismic wave amplitudes is straightfor¬ 
ward, they depend on both the source, which is not perfectly 
known, and all the elastic and anelastic effects anywhere along 
the paths that the seismic energy traveled between the source 
and the receiver. Hence it can be hard to distinguish the effects 
of anelasticity from elastic processes. 

This inherent uncertainty is somewhat compensated by the 
fact that variations in anelasticity are large, as illustrated by 
comparison of records of an earthquake in Texas at stations in 
Nevada and Missouri (Fig. 3.7-1). The Nevada seismogram 
has much less high-frequency energy, showing that the crust 
in the western USA is much more attenuating than that in the 
Midwest. By comparison, seismic velocity variations between 
these areas are generally less than ±10%. Even so, because of 
the difficulties in measuring attenuation, variations in attenuation
at both regional and global scales are much less resolved
than similar variations in velocity.

Fig. 3.7-2 Schematic representation of the variations of seismic
attenuation (top) and normalized velocity (bottom) as a function of
normalized temperature changes. Attenuation is more sensitive to
increased temperature. (Romanowicz, 1995. J. Geophys. Res., 100,
12,375-94, copyright by the American Geophysical Union.)

Attenuation is valuable for studying temperature variations 
within the earth. Many important geophysical processes (mantle 
convection, plate tectonics, magmatism, etc.) involve lateral 
variations in temperature. Elastic velocities are also sensitive to 
temperature, but are better for mapping cold (fast) anomalies 
like subducting slabs than hot (slow) material like that at 
midocean ridges (Section 2.5.10). As shown in Fig. 3.7-2, 
seismic velocities depend nearly linearly upon temperature, 
whereas attenuation depends exponentially on temperature. 
Thus combining velocity and attenuation studies can provide 
valuable information. Figure 3.7-3 shows the velocity and 
attenuation structure at a portion of the East Pacific rise axis, 
where a low-velocity, high-attenuation region is interpreted as 
a melt-filled magma chamber. 

Fig. 3.7-3 Results of P-wave velocity (top) and attenuation (bottom)
tomography across the axis of the East Pacific rise. (Solomon and Toomey,
1992, reproduced with the permission of Annual Reviews Inc.)


3.7.2 Geometric spreading 

The most obvious effect causing seismic wave amplitudes to 
vary with distance is geometric spreading, in which the energy 
per unit wave front varies as a wave front expands or contracts. 
Geometric spreading differs for surface and body waves. For 
a homogeneous elastic spherical earth, a surface wave front 
would spread as it moved from the source to a distance 90° 
away, refocus as it approached the antipode on the other side 
of the earth from the source, and so on. The amplitudes would 
be largest at the source and antipode, where all the energy 
would be concentrated, and smallest halfway between, 90° 
from the source. On a homogeneous flat earth, the surface 
waves would spread out in a growing ring with circumference 
$2\pi r$, where $r$ is the distance from the source. Conservation of
energy 2 requires that the energy per unit wave front decrease 
as 1/r, whereas the amplitudes, which are proportional to the 
square root of energy (Eqn 2.4.65), decrease as $1/\sqrt{r}$. However,
because the earth is a sphere, the ring wraps around the globe 
(Fig. 3.7-4), making the energy per unit wavefront vary as 

$$1/r = 1/(a \sin \Delta), \qquad (1)$$

2 As we saw in discussing wave reflection and transmission (Section 2.2.4), ampli¬ 
tudes are easier to visualize, but energy is conserved, and hence often more useful for 
understanding wave behavior. 





Fig. 3.7-4 Geometric spreading of surface waves for a laterally 
homogeneous earth yields a wave front that is a ring whose 
circumference varies as $a \sin \Delta$.

where $\Delta$ is the angular distance from the source and $a$ is the
earth’s radius. Thus the
amplitudes decrease as $(a \sin \Delta)^{-1/2}$, with minimum at $\Delta = 90^\circ$,
and maxima at 0° and 180°. Actually, not all the energy would
focus at the antipode and source even if the earth had no lateral 
variations in velocity, because some defocusing would result 
from the earth’s ellipsoidal shape. Lateral heterogeneity, dis¬ 
cussed next, further distorts the wavefront. 
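
The spreading factor of Eqn 1 is easy to evaluate numerically. The following is a minimal Python sketch, assuming a laterally homogeneous spherical earth with a mean radius of 6371 km; the function name and the default radius are illustrative choices rather than anything defined in the text.

```python
import math


def surface_wave_spreading(delta_deg, earth_radius_km=6371.0):
    """Return the surface wave spreading factor (a sin(delta))**(-1/2).

    delta_deg is the angular distance from the source in degrees.
    earth_radius_km is an assumed mean earth radius a.
    """
    if not 0.0 < delta_deg < 180.0:
        # The ring has zero circumference at the source and the antipode,
        # so this ray-theory factor is singular there.
        raise ValueError("delta_deg must lie strictly between 0 and 180")
    ring_radius_km = earth_radius_km * math.sin(math.radians(delta_deg))
    return ring_radius_km ** -0.5


if __name__ == "__main__":
    reference = surface_wave_spreading(90.0)  # smallest amplitude, at 90 degrees
    for delta in (10, 30, 60, 90, 120, 150, 170):
        ratio = surface_wave_spreading(delta) / reference
        print(f"delta = {delta:3d} deg, amplitude relative to 90 deg = {ratio:.2f}")
```

Dividing by the value at 90° removes the dependence on the assumed radius, so the printed ratios depend only on the angular distance and are symmetric about 90°.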

For body waves, consider a spherical wavefront moving 
away from a deep earthquake. Energy is conserved on the expanding
spherical wavefront whose area is $4\pi r^2$, where $r$ is the
radius of the wavefront. Thus the energy per unit wave front
decays as $1/r^2$, and the amplitude decreases as $1/r$. In reality,
because body waves travel through an inhomogeneous earth, 
their amplitude depends on the focusing and defocusing of rays 
by the velocity structure. The effects of the variations in veloc¬ 
ity with depth were shown in Section 3.4 by considering the 
density of rays with different incidence angles that arrive at 
a given distance. These amplitude variations are viewed as 
geometric spreading and described by the second derivative of 
the travel time curve (Eqn 3.4.20). Thus, although the phenom¬ 
enon of geometric spreading is intuitive, quantification of its 
effects is complicated. 

3.7.3 Multipathing 

Seismic waves are also focused and defocused by lateral varia¬ 
tions in velocity. Although physically this process is the same 
as the effects of vertical variations, it is often distinguished by 
the term multipathing. The distinction reflects our view of the
earth as an essentially layered planet with secondary lateral 
variations. 

As we discussed for tsunamis (Fig. 2.8-9), seismic waves 
refract towards low-velocity anomalies and away from high- 
velocity anomalies. Figure 3.7-5 illustrates this effect for a plane 
wave passing through a refracting layer of variable thickness. 















Fig. 3.7-5 An example of how velocity heterogeneities affect wave 
amplitudes. A plane wave impinging from the left is refracted by a layer 
of variable thickness. The amplitudes of the waves arriving at the right 
are shown. Regions of wide ray spacing have low amplitudes, and dense 
spacing yields large amplitudes. Concentrated lines, or caustics, cause very 
high amplitudes. (Hannay, 1986. Reproduced with permission from the 
Institute of Mathematics.) 



Fig. 3.7-6 Schematic example of how velocity heterogeneity can cause 
an erroneous estimate of either the focal mechanism or attenuation. 

The figure-eight structure at the earthquake shows the amplitude of a 
radiated surface wave as a function of azimuth, which depends on the 
focal mechanism (in this case dip-slip motion on a vertical fault). The 
predicted path would leave the source with a lower amplitude than the 
actual path, which is bent by the high-velocity region. Hence a focal 
mechanism study using these data without accounting for the perturbed 
ray path would be incorrect. Conversely, modeling the amplitudes without 
considering the high-velocity region would yield too-low estimates of 
attenuation. 

The ray paths, which are normal to the local wave front, show 
how the initially planar wave is refracted. The ray spacing 
represents the energy density, so amplitudes are low where the 
rays are far apart, and high where they are close together. In 
some cases the energy focuses into caustics, areas of infinitely 
high energy density, which appear as solid black regions. 

This example illustrates that velocity variations can affect 
the amplitudes of seismic waves some distance away. For ex¬ 
ample, small velocity heterogeneities near an earthquake can 
cause large amplitude variations at teleseismic distances. This 

Fig. 3.7-7 Numerical simulation of the paths taken by seismic energy
associated with the body wave phases S and sS for a 120 km-deep
earthquake. The values shown, computed using normal modes, show
the sensitivity of the travel time to velocity perturbations. These phases
sample the structure in a banana-shaped region shown in side view (left)
and end-on (right) surrounding the geometric ray path (solid line).
(Zhao et al., 2000.)


effect can be important, because most earthquakes occur at plate 
boundaries, such as subduction zones or mid-ocean ridges, 
where there are significant velocity heterogeneities. This phe¬ 
nomenon can cause difficulties in the interpretation of seismic 
data. For example, assume (Fig. 3.7-6) that the actual wave 
path from an earthquake to a receiver differs from that pre¬ 
dicted due to a region of anomalously fast velocities. If the 
amplitudes of these waves were used to study the earthquake’s 
focal mechanism, the result would be biased because the waves 
left the source in a direction different from that expected if the 
velocity heterogeneity were not present. Conversely, if the focal 
mechanism were known, the observed amplitude would differ 
from that expected, so an estimate of the attenuation would be 
incorrect. 

When multipathing occurs, the seismic waves arriving at a 
receiver can be viewed as having taken some ray paths in addi¬ 
tion to the direct path, and so have sampled a larger region of 
the earth. A way to view this is that Fermat’s principle giving 
the geometric ray path applies exactly only to waves of infinite 
frequency. For waves of finite frequency, we can view the seismic 
waveform as a coherent sum of energy that travels all possible 
paths that arrive within a half-period of the infinite-frequency 
wave, which took the shortest time. These paths form a volume 
called the first Fresnel zone around the infinite-frequency path. 
Successive half-periods correspond to higher-order Fresnel 
zones. For longer-period waves, the maximum time over which 
energy arrives coherently is longer, so the Fresnel zones are 
proportionately larger. For example, teleseismic body waves 
sample a banana-shaped region about the geometric ray path. 
Figure 3.7-7 shows Fresnel zones for a body wave phase in a 
laterally homogeneous earth, plotted in terms of how the travel 
time is affected by velocity perturbations. The curved ray path 
represents the effects of vertical variations in velocity on the 
infinite frequency ray, and the surrounding “banana” represents 
the effects of finite-frequency waves. Lateral heterogeneity 
would distort the “banana.” 
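
The half-period criterion above can be turned into a rough size estimate. The sketch below is a simplified illustration, not the normal mode calculation behind Fig. 3.7-7: it assumes a constant-velocity medium with a straight direct path, and finds how far off the midpoint of that path a detour can pass while arriving no more than half a period late. The function name and the example values are hypothetical.

```python
import math


def fresnel_half_width(path_km, velocity_km_s, period_s):
    """Half-width of the first Fresnel zone at the midpoint of a straight path.

    A detour through a point offset h from the midpoint has length
    2 * sqrt((path/2)**2 + h**2). Requiring it to exceed the direct path
    by no more than half a wavelength (half a period in time) gives h.
    """
    wavelength_km = velocity_km_s * period_s
    half_path_km = path_km / 2.0
    return math.sqrt((half_path_km + wavelength_km / 4.0) ** 2 - half_path_km ** 2)


if __name__ == "__main__":
    # Assumed values: a 1000 km path at 8 km/s, for short and long periods.
    for period in (1.0, 20.0):
        width = fresnel_half_width(1000.0, 8.0, period)
        print(f"period = {period:4.1f} s, half-width at midpoint = {width:.0f} km")
```

Longer periods give a wider zone, consistent with the statement that Fresnel zones grow with period. In the real earth the zone follows the curved ray and is distorted by velocity structure, which this sketch ignores.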

Fig. 3.7-8 Schematic representation of different approaches to seismic
wave propagation in a medium with velocity heterogeneity. The approach
depends on the ratio of the heterogeneity size $a$ to the wavelength $\lambda$ and
the distance $L$ the wave travels through the heterogeneous region. (After
Aki and Richards, 1980. From Quantitative Seismology, © 1980 by
W. H. Freeman and Company, used with permission.)


3.7.4 Scattering 

An effect related to multipathing is the scattering of seismic
waves. Both effects are complicated, and the distinction be¬ 
tween them is gradational. As shown in Fig. 3.7-8, whether 
the effects of velocity heterogeneity are regarded as scattering 
depends on the ratio of the heterogeneity size to the wavelength 
and the distance the wave travels through the heterogeneous 
region. When the heterogeneity is large compared to the wave¬ 
length, we regard the wave as following a distinct ray path that 
is distorted by multipathing. When the velocity heterogene¬ 
ities are closer in size to the wavelength, we think of scattered 
energy rather than distinct ray paths. However, when the 
heterogeneities are much smaller than the wavelength, they 
simply change the medium’s overall properties. The further the 
wave travels in the heterogeneous region, the more useful the 
scattering description becomes. Hence for longer distances, 
the wavelength range viewed as scattering increases. 3 

Figure 3.7-8 also illustrates that diffraction can be viewed 
as behavior intermediate between scattering and multipathing. 

3 The fact that light scattering in the atmosphere depends on wavelength and the dis¬ 
tance traveled has familiar consequences. Because the shortest wavelengths of visible 
light are the most scattered, blue light reaching us from all directions makes the sky 
appear blue. The loss of blue light makes the sun appear yellow, although it would ap¬ 
pear white if observed from a spacecraft. At sunset, when the sunlight passes through 
a longer path in the atmosphere than at other hours, intermediate wavelengths are 
also scattered, leaving direct light from the sun enhanced in the longest visible wave¬ 
lengths (red light) and making the sun appear red. 


As we have seen, some of the behavior of diffracted waves 
can be derived either using a Huygens’ source scattering repres¬ 
entation (Section 2.5.10) or by using ray paths in a medium 
with variable velocity, as for the head wave (Section 3.2.1) or 
core diffraction (Section 3.5.2). These ray paths were not truly 
geometric, in that energy was required to follow paths that 
did not obey Snell’s law. The distinction between ray theory 
and diffraction depends on wavelength, as discussed in Section 
2.5.10, so waves diffracted around the core are depleted in the 
higher frequencies. 4 

Scattering can be viewed in different ways. In some situ¬ 
ations we view the scattering as deterministic, and try to image 
distinct scatterers. For example, migration methods in reflec¬ 
tion seismology (Section 3.3.7) seek to undo the effects of scat¬ 
tering and produce a clearer image of the subsurface. In other 
situations, we view the medium as containing many scatterers 
and consider their effects on the wave field statistically. This 
approach is taken to the scattering of PKP waves (Fig. 3.5-8), 
with a wavelength of about 10 km, by lower mantle hetero¬ 
geneities of about that size. 

Scattering is especially important in the continental crust, 
which has many small layers and reflectors resulting from 
billions of years of continental evolution. Although these 
structures do not significantly affect waves with wavelengths 
longer than tens of km, for shorter-wavelength waves they can 
act as point scatterers or Huygens’ sources. Hence some of the 
scattered energy arrives at a receiver after the initial pulse 
that obeyed Fermat’s principle and took the shortest path. 
This scattered energy causes an arrival to have a coda , a tail of 
incoherent energy that decays over a duration of seconds or 
minutes. The main arrival has a polarity related to the direction 
of propagation that can be observed on a three-component 
seismometer by forming particle motion plots (Fig. 2.7-6). By 
contrast, the scattered energy arrives from various directions 
and thus shows little or no preferred particle motion. 

Figure 3.7-9 demonstrates the scattering for a seismic arrival.
The unscattered wave travels the shortest distance and gives the
initial arrival (left). Scattered energy lost from this arrival that
instead arrives later could have been scattered from an infinite
number of locations that would yield the observed travel time.
In a constant-velocity medium, the locus of these possible scatterers
forms an ellipsoid with the source and the receiver as
foci (center). Larger ellipsoids define the possible scatterers for
energy that arrives later (right). These ellipsoids are distorted
by velocity heterogeneity and are analogous to the Fresnel
volume used when we consider the waves as following distinct
ray paths.
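
The geometry of these ellipsoids follows directly from the travel time. As a minimal sketch, assuming a constant-velocity medium and single scattering, the total path length from source to scatterer to receiver is the velocity times the arrival time, which fixes the semi-major axis; the source and receiver are the foci. The function names and example values below are hypothetical.

```python
import math


def scatterer_ellipsoid(distance_km, velocity_km_s, arrival_time_s):
    """Return (semi_major, semi_minor) axes in km of the scatterer ellipsoid.

    All single scatterers on this surface give the same arrival time,
    because source-to-scatterer plus scatterer-to-receiver distance
    equals velocity * arrival_time for every point on it.
    """
    semi_major = velocity_km_s * arrival_time_s / 2.0
    focal_distance = distance_km / 2.0
    if semi_major < focal_distance:
        raise ValueError("arrival time is earlier than the direct arrival")
    semi_minor = math.sqrt(semi_major ** 2 - focal_distance ** 2)
    return semi_major, semi_minor


def scattered_travel_time(x_km, y_km, distance_km, velocity_km_s):
    """Travel time via a scatterer at (x, y), with foci at (-D/2, 0) and (D/2, 0)."""
    focal_distance = distance_km / 2.0
    to_scatterer = math.hypot(x_km + focal_distance, y_km)
    to_receiver = math.hypot(x_km - focal_distance, y_km)
    return (to_scatterer + to_receiver) / velocity_km_s


if __name__ == "__main__":
    # Assumed values: 100 km source-receiver distance, 6 km/s velocity.
    for arrival in (20.0, 30.0, 40.0):
        major, minor = scatterer_ellipsoid(100.0, 6.0, arrival)
        print(f"t = {arrival:.0f} s: semi-major = {major:.0f} km, semi-minor = {minor:.1f} km")
```

Later arrival times give larger ellipsoids, as in the right panel of Fig. 3.7-9.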

Scattering is especially noticeable on the moon. Figure 3.7-10 
contrasts seismic records of an earthquake and the impact of 
a rocket on the moon. Most of the earthquake’s energy arrives 
in the main P- and S-wave arrivals. By contrast, on the moon 

4 This effect makes it hard to understand what someone is saying when they are 
standing around a corner, because the voice sounds muffled due to the loss of the 
higher frequencies. 

Fig. 3.7-9 Development of a P-wave coda
due to scattering. Left: The first arrival 
follows the minimum-time path from the 
earthquake (EQ) to the station (STA) 
according to Fermat’s principle, and 
involves no scattered energy. Center: 
Scattered energy arrives after the first 
arrival. An infinite number of possible 
locations for scatterers yield arrivals at this 
same time. In a homogeneous medium the 
locus of these points forms an ellipsoidal 
surface. Right: Energy arriving later in the 
coda can be modeled as arising from a larger 
ellipsoidal surface of possible scatterers. 

Fig. 3.7-10 Comparison of seismograms for the earth and the moon.
Top: Seismogram recorded at Cathedral Cave, Missouri (CCM), from a
small earthquake 183 km away. Bottom: Seismogram recorded by the
Apollo 12 seismometer of the impact of the Apollo 14 Saturn booster
rocket 147 km away. The terrestrial record shows high attenuation,
whereas the lunar seismogram shows intense scattering due to the
fractured regolith and very weak attenuation due to the lack of
intergranular water. (Mitchell, 1995. Rev. Geophys., 33, 441-62,
copyright by the American Geophysical Union.)


the energy is intensely scattered, and no main arrivals can be 
identified. This is probably because intrinsic attenuation is 
much larger in the earth’s crust than on the moon. The move¬ 
ments of interstitial fluids in the earth’s crust greatly reduce 
seismic wave amplitudes, whereas energy scattered by the 
moon’s highly fractured near-surface regolith layer is poorly 
absorbed and reverberates. As a result, efforts to identify seis¬ 
mic phases and use them to study the moon’s internal structure 
have been generally unsuccessful. 


3.7.5 Intrinsic attenuation

We can gain insight into the intrinsic attenuation of seismic 
waves by examining a simple system, a damped harmonic oscil¬ 
lator composed of a spring and a dashpot. We use Newton’s 
second law, F = ma, to describe the displacement u(t) of a mass 
m. The restoring force of the spring is minus
the spring constant $k$ times the spring extension, or displacement
from the equilibrium position, so

$$m \frac{d^2 u(t)}{dt^2} + k u(t) = 0. \qquad (2)$$

Once set in motion by an impulse, this frictionless system has 
a purely elastic response described by a perpetual harmonic 
oscillation 

$$u(t) = A e^{i\omega_0 t} + B e^{-i\omega_0 t}, \qquad (3)$$

where A and B are constants, and the mass moves back and 
forth with a natural frequency 


$$\omega_0 = (k/m)^{1/2}. \qquad (4)$$

One example of this general solution is 


$$u(t) = A_0 \cos(\omega_0 t). \qquad (5)$$


Once the motion is started, this undamped oscillation continues
forever, because no energy is lost. However, this is no
longer the case if the system contains a dashpot, or damping 
term. The damping force is proportional to the velocity of the 
mass and opposes its motion. Hence the equation of motion 
(Eqn 2) becomes 


$$m \frac{d^2 u(t)}{dt^2} + \gamma m \frac{du(t)}{dt} + k u(t) = 0, \qquad (6)$$


where $\gamma$ is the damping factor. To simplify this, we define the
quality factor

$$Q = \omega_0 / \gamma, \qquad (7)$$

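and, dividing Eqn 6 by $m$ and using $\omega_0^2 = k/m$ (Eqn 4), rewrite it as

$$\frac{d^2 u(t)}{dt^2} + \frac{\omega_0}{Q} \frac{du(t)}{dt} + \omega_0^2 u(t) = 0. \qquad (8)$$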

This differential equation, which describes the damped har¬ 
monic oscillator, can be solved assuming that the displacement 
is the real part of a complex exponential 


$$u(t) = A_0 e^{ipt}, \qquad (9)$$

where p is a complex number. Substituting Eqn 9 into Eqn 8 
yields 


$$A_0 e^{ipt} (-p^2 + ip\omega_0/Q + \omega_0^2) = 0. \qquad (10)$$

For this to be satisfied for all values of $t$,

$$-p^2 + ip\omega_0/Q + \omega_0^2 = 0. \qquad (11)$$

Because p is complex, we break it into its real and imaginary 
parts, 

$$p = a + ib, \qquad p^2 = a^2 + 2iab - b^2, \qquad (12)$$

so Eqn 11 gives

$$-a^2 - 2iab + b^2 + ia\omega_0/Q - b\omega_0/Q + \omega_0^2 = 0, \qquad (13)$$

which can be split into equations for the real and imaginary 
parts and solved separately: 

$$\text{Real:} \quad -a^2 + b^2 - b\omega_0/Q + \omega_0^2 = 0, \qquad (14)$$

$$\text{Imaginary:} \quad -2ab + a\omega_0/Q = 0.$$

Solving the imaginary part for $b$ gives

$$b = \omega_0/2Q, \qquad (15)$$

and putting this into the equation for the real part gives

$$a^2 = \omega_0^2 - \omega_0^2/4Q^2 = \omega_0^2 (1 - 1/4Q^2). \qquad (16)$$

Thus we define 

$$\omega = a = \omega_0 (1 - 1/4Q^2)^{1/2}, \qquad (17)$$

and rewrite Eqn 9 with separate real and imaginary parts,

$$u(t) = A_0 e^{i(\omega t + ibt)} = A_0 e^{-bt} e^{i\omega t}. \qquad (18)$$


The real part is the solution for the damped harmonic 
displacement, 



Fig. 3.7-11 For a damped harmonic oscillator, the envelope (dashed lines)
amplitude is initially $A_0$, but decays with time at a rate determined by the
quality factor, $Q$.


$$u(t) = A_0 e^{-\omega_0 t/2Q} \cos(\omega t). \qquad (19)$$

This solution shows how the damped oscillator responds 
to an impulse at time zero (Fig. 3.7-11). It is no longer a simple 
harmonic oscillation because it differs in two ways from the 
undamped solution (Eqn 5). The exponential term expresses 
the decay of the signal’s envelope, or overall amplitude, 

$$A(t) = A_0 e^{-\omega_0 t/2Q}, \qquad (20)$$

which is superimposed on the harmonic oscillation given by the
cosine term. Moreover, the frequency of the harmonic oscillation
(Eqn 17) is changed from the natural frequency of the
undamped system, $\omega_0$, by an amount depending on the quality
factor. $Q$ is inversely proportional to the damping factor, $\gamma$, so
the smaller the damping, the greater $Q$ is. For no damping, $Q$ is
infinite, and the damped solution reduces to the undamped
one, because its amplitude does not decay with time (Eqn 20),
and its frequency remains $\omega_0$ (Eqn 17). As the damping increases,
$Q$ decreases, so the amplitude decays faster, and the
frequency changes more from its undamped value. Equation 20
shows that the amplitude decays to $e^{-1}$ (0.37) of its original
value by the relaxation time

$$t_{1/e} = 2Q/\omega_0. \qquad (21)$$

Because the energy in an oscillating system is proportional to 
the square of the amplitude, as we saw for a harmonic wave in 
Section 2.2.4, Eqn 20 gives the energy of the oscillator as 

$$E(t) = \tfrac{1}{2} k A(t)^2 = \tfrac{1}{2} k A_0^2 e^{-\omega_0 t/Q}. \qquad (22)$$

Energy decays faster than the amplitude, because the negative
exponent in Eqn 22 is twice as large as in Eqn 20.
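
The damped oscillator results can be checked numerically. The sketch below is one possible illustration: it evaluates Eqns 17, 19, 20, 21, and 22, and uses centered finite differences to confirm that the displacement of Eqn 19 satisfies the equation of motion, Eqn 8. The function names, the step size, and the example values of $\omega_0$ and $Q$ are assumptions made for this example.

```python
import math


def envelope(t, a0, omega0, q):
    """Amplitude envelope A(t) of Eqn 20."""
    return a0 * math.exp(-omega0 * t / (2.0 * q))


def displacement(t, a0, omega0, q):
    """Damped displacement u(t) of Eqn 19."""
    if q <= 0.5:
        raise ValueError("Q must exceed 1/2 for an oscillatory solution")
    omega = omega0 * math.sqrt(1.0 - 1.0 / (4.0 * q * q))  # Eqn 17
    return envelope(t, a0, omega0, q) * math.cos(omega * t)


def relaxation_time(omega0, q):
    """Time for the envelope to decay to 1/e of its initial value (Eqn 21)."""
    return 2.0 * q / omega0


def energy(t, k, a0, omega0, q):
    """Oscillator energy of Eqn 22 for spring constant k."""
    return 0.5 * k * envelope(t, a0, omega0, q) ** 2


def equation_of_motion_residual(t, a0, omega0, q, h=1e-4):
    """Left side of Eqn 8 with derivatives from centered finite differences."""
    def u(s):
        return displacement(s, a0, omega0, q)

    velocity = (u(t + h) - u(t - h)) / (2.0 * h)
    acceleration = (u(t + h) - 2.0 * u(t) + u(t - h)) / (h * h)
    return acceleration + (omega0 / q) * velocity + omega0 ** 2 * u(t)


if __name__ == "__main__":
    omega0, q = 2.0 * math.pi, 8.0  # assumed: 1 s natural period, Q = 8
    t_e = relaxation_time(omega0, q)
    print("relaxation time (s):", t_e)
    print("envelope ratio at relaxation time:", envelope(t_e, 1.0, omega0, q))
    print("energy ratio at relaxation time:", energy(t_e, 1.0, 1.0, omega0, q) / energy(0.0, 1.0, 1.0, omega0, q))
    print("residual of Eqn 8 at t = 0.3 s:", equation_of_motion_residual(0.3, 1.0, omega0, q))
```

The residual is not exactly zero because of the finite-difference approximation, but it should be small compared with the individual terms, which are of order $\omega_0^2$.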

3.7.6 Quality factor, Q 

The solution for the damped harmonic oscillator incorporated 
the damping through the quality factor, $Q$. Attenuation for
seismic waves and a variety of other physical phenomena is
often discussed in terms of $Q$ or $Q^{-1}$. Although $Q$ has more
convenient values, $Q^{-1}$ has the advantage that it is directly,
rather than inversely, proportional to the damping. In some 
cases, Q is used to describe the decay of an oscillation, whereas 
in others it is used to describe the physical properties of the 
system that cause a disturbance to attenuate. For example, 
the Q of one of the earth’s normal modes, which is directly 
analogous to a damped oscillator, describes how the mode 
decays with time. This decay results from the distribution of 
material in the earth that causes seismic energy to be lost 
to heat. This distribution can be described in terms of a Q, 
or anelastic attenuation, structure analogous to the elastic 
velocity structure. 

As a result, we speak of the Q of surface waves, body waves, 
and crustal phases like Lg. We also speak of the variation 
within the earth of $Q_\alpha$ and $Q_\beta$, which control the attenuation
of P and S waves. The anelastic structure of the earth, given by
variations in $Q_\alpha$ and $Q_\beta$, is analogous to the elastic velocity
structure because $Q$ can be viewed mathematically as an
imaginary part of the velocity. To see this, note that Eqn 9, which
we used to derive the decaying oscillation, can be viewed as an 
oscillation with a complex frequency p 

$$u(t) = A_0 e^{ipt} = A_0 e^{i(a + ib)t}, \qquad (23)$$

where the real and imaginary parts of the frequency are 

$$a = \omega, \qquad b = \omega^* = \omega_0/2Q \approx \omega/2Q, \qquad (24)$$

assuming that attenuation is small ($Q$ large) enough that $\omega \approx \omega_0$.
Hence we write

$$Q^{-1} = 2b/a = 2\omega^*/\omega. \qquad (25)$$

Treating the attenuation as an imaginary part of the 
frequency and dividing by the wavenumber lets us treat the 
corresponding velocity for a propagating wave as complex, 

$$c + ic^* = \omega/k + i\omega^*/k = \omega/k + i\omega Q^{-1}/2k, \qquad (26)$$

so

$$Q^{-1} = 2c^*/c. \qquad (27)$$

Thus we can express the attenuation of P- and S-waves by using
the quality factors $Q_\alpha$ and $Q_\beta$ to give imaginary parts to the
velocities. If there is no attenuation ($Q = \infty$), the frequency and


the velocity have no imaginary parts. This formulation is useful 
because it means that methods used to invert surface wave 
velocities or normal mode eigenfrequencies to find velocity 
in the earth can also be used to invert observations of their 
attenuation to find the distribution of anelasticity. 

We pose the complex parts of the velocities in terms of the 
properties of the material causing attenuation by treating the 
elastic moduli as having imaginary parts. For the shear velocity 

$$\beta + i\beta^* = \beta (1 + iQ_\beta^{-1}/2) = ((\mu + i\mu^*)/\rho)^{1/2} = \beta (1 + i\mu^*/\mu)^{1/2} \approx \beta (1 + i\mu^*/2\mu), \qquad (28)$$

where the last step used the first term of the Taylor series,
because the attenuation and hence imaginary part is small.
Comparing terms shows that

$$Q_\beta^{-1} = \mu^*/\mu. \qquad (29)$$

A similar analysis shows that the quality factor for P waves is 
given by the imaginary parts of both the bulk and shear moduli 

Q-£=(K* +4/3pL*)/(K + 4/3n). (30) 

Physically, it is useful to think of energy as being lost in either 
compressional or shear deformation, so we express their 
attenuation in terms of imaginary parts of the compressibility 
and rigidity 

Q~i = K*/K Q~J = H*IH = Q~p- (31) 

These quality factors are related to those for the velocities by 

Qa = LQ~[f + (1 - L)Q~k L = (4/3)(p/a)\ (32) 

In general little energy is lost in compression, so Q~^ is very 
small, and thus most of the attenuation for P waves occurs in 
shear, making (4/9)Q~J. 
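As a rough numerical illustration of Eqn 32, the sketch below combines shear and bulk attenuation into a P-wave quality factor. The function name and the input values (a Poisson solid with $Q_\mu = 200$ and no bulk loss) are assumptions chosen for illustration, not values from the text.

```python
def q_alpha_inverse(q_mu_inv, q_k_inv, vp, vs):
    """Eqn 32: P-wave attenuation from shear and bulk attenuation."""
    L = (4.0 / 3.0) * (vs / vp) ** 2
    return L * q_mu_inv + (1.0 - L) * q_k_inv


# Hypothetical inputs: Poisson solid (vp/vs = sqrt(3)), Q_mu = 200, Q_K infinite.
vp, vs = 3.0 ** 0.5, 1.0
q_alpha = 1.0 / q_alpha_inverse(1.0 / 200.0, 0.0, vp, vs)
print(q_alpha)  # close to (9/4) * 200, since L = 4/9 for a Poisson solid
```

With no bulk loss, $Q_\alpha$ is simply $Q_\mu / L$, so P waves are attenuated less than S waves along the same path.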

Fig. 3.7-12 Frequency dependence of attenuation for seismic waves in
the mantle. Q is shown as though all measurements were for ScS waves,
a good measure of the average mantle value because of their path from
surface to core and back. (Sipkin and Jordan, 1979. © Seismological
Society of America. All rights reserved.)

Techniques for measuring Q in the earth follow from those
used to measure Q for the decay of an oscillation. From Eqn
20, taking the natural logarithm of the envelope shows that

$$\ln A(t) = \ln A_0 - \omega_0 t/2Q, \qquad (33)$$

so Q can be found from the slope of the logarithmic decay.
Alternatively, if successive peaks one full period $T = 2\pi/\omega_0$
apart have amplitudes

$$A_1(t_1) = A_0 \exp(-\omega_0 t_1/2Q) \quad \text{and} \quad A_2(t_1 + T) = A_0 \exp(-\omega_0 (t_1 + T)/2Q), \qquad (34)$$

their ratio is

$$A_1/A_2 = \exp[-\omega_0 t_1/2Q + \omega_0 (t_1 + T)/2Q] = \exp(\pi/Q), \qquad (35)$$

so

$$Q = \pi/\ln(A_1/A_2). \qquad (36)$$

To illustrate this approach, note that in Fig. 3.7-11 the second
peak, at $\omega t = 2\pi$, is about 2/3 of the first peak, at $\omega t = 0$. Thus
$Q \approx \pi/\ln(3/2) \approx 8$. This is small compared to Q for mantle
rocks, which is in the range of 200-500, but comparable to
that for some sedimentary rocks. For example, S waves in shale
have $Q \approx 10$.
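The two measurement approaches (Eqns 33 and 36) can be sketched as follows. This is a minimal illustration, not a processing recipe: the function names are hypothetical, and the envelope data in the second check are synthetic, generated with an assumed Q of 50.

```python
import math


def q_from_peak_ratio(a1, a2):
    """Eqn 36: Q from two successive peaks one full period apart."""
    return math.pi / math.log(a1 / a2)


def q_from_envelope(times, amplitudes, omega0):
    """Eqn 33: the least-squares slope of ln A(t) equals -omega0 / 2Q."""
    n = len(times)
    logs = [math.log(a) for a in amplitudes]
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    return -omega0 / (2.0 * slope)


# Peak ratio of 3/2, as read from Fig. 3.7-11 in the text.
print(q_from_peak_ratio(3.0, 2.0))

# Synthetic envelope (assumed Q = 50, period 1 s) to check the slope method.
omega0, q_true = 2.0 * math.pi, 50.0
times = [0.5 * k for k in range(20)]
amps = [math.exp(-omega0 * t / (2.0 * q_true)) for t in times]
print(q_from_envelope(times, amps, omega0))
```

The peak-ratio method needs only two readings, whereas the slope method uses the whole envelope and is therefore less sensitive to noise on any single peak.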

Another way to view Q is as the number of cycles the oscillation
takes to decay to a certain level. The number of cycles, n, is

$$n = t/T = \omega t/2\pi \approx \omega_0 t/2\pi, \qquad (37)$$

where the last approximation, based on Eqn 17, assumes
that the attenuation is small enough ($Q \gg 1$) so that $\omega \approx \omega_0$.
The amplitude at time $t_n$, after n cycles, is

$$A(t_n) = A_0 e^{-\omega_0 t_n/2Q} = A_0 e^{-n\pi/Q}, \qquad (38)$$

so, if we define n as equal to Q,

$$A(t_Q) = A_0 e^{-\pi} \approx 0.04 A_0. \qquad (39)$$

Thus, after Q cycles, the amplitude drops to a level of $e^{-\pi}$ or 4%
of the original amplitude. Hence, in Fig. 3.7-11, more than
95% of the amplitude is lost after $Q \approx 8$ cycles.

Q can describe the oscillation’s decay in either time or space.
For standing waves like normal mode oscillations, Q describes
the decay of amplitudes with time. For traveling waves, we
replace t with x/c, where x is the distance traveled and c is the
velocity. Thus Eqn 20 becomes

$$A(x) = A_0 e^{-\omega_0 x/2cQ}, \qquad (40)$$

which describes how the amplitude decays with the distance
the wave propagates.

When these techniques are used to measure Q for seismic
waves, we find that Q varies with frequency (Fig. 3.7-12).
Q is essentially constant at low frequencies, about 0.001 to
0.1 Hz, but then increases with frequency. Thus Q values
derived from normal mode analysis are lower than those
obtained from higher-frequency waves. Although our first
instinct might be that Q should be frequency-independent,
such a situation imposes a stringent requirement. Because
$Q = \omega/\gamma$, constant Q requires a physical mechanism in the earth
with damping proportional to frequency. We will explore this
issue shortly.

Before doing so, it is also worth noting that our model of the
damped oscillator assumes that the attenuation is linear, such
that Q is independent of the amplitude of the wave. This is the
same as assuming that the amplitudes are not too large. In most
rocks this condition is satisfied if the strains involved with the
wave propagation are less than about $10^{-6}$. Although this is
true at teleseismic distances, it is not the case near an earthquake
or an explosion, where the elastic strain can exceed
$10^{-4}$. Large earthquakes can cause large strains, and hence a
region of nonlinear attenuation.

3.7.7 Spectral resonance peaks

We are interested in understanding how anelasticity in the earth
causes the attenuation of propagating waves. This behavior is
an example of the general case of how a damped harmonic
oscillator responds to a driving force that depends on frequency.
To see this, we modify Eqn 8 by adding a harmonic driving
force, and so have the inhomogeneous equation

$$\frac{d^2 u}{dt^2} + \gamma \frac{du}{dt} + \omega_0^2 u = e^{i\omega t}. \qquad (41)$$

The solution is found using a trial solution

$$u(t) = A(\omega) e^{i\phi(\omega)} e^{i\omega t}. \qquad (42)$$

Substituting this in Eqn 41 yields the amplitude response, $A(\omega)$,
and phase response, $\phi(\omega)$,

$$A(\omega) = [(\omega_0^2 - \omega^2)^2 + (\omega\gamma)^2]^{-1/2}, \qquad \phi(\omega) = \tan^{-1}\left[\frac{-\gamma\omega}{\omega_0^2 - \omega^2}\right]. \qquad (43)$$

As shown in Fig. 3.7-13, the amplitude and the phase
responses depend on the damping factor $\gamma$ and how far the
forcing frequency $\omega$ is from the oscillator’s natural or resonant
frequency, $\omega_0$. The resonance curve shows how the damped
harmonic oscillator responds to the frequency-dependent
driving force. The closer the driving frequency $\omega$ is to the
oscillator’s natural frequency $\omega_0$, the more the oscillator
responds.

Fig. 3.7-13 Amplitude (left) and phase
(right) response of a forced, damped
harmonic oscillator with natural frequency
$\omega_0$. For greater damping (lower Q) the peak
decreases and both it and phase shift are
broadened from the sharp values they have
with little damping.

The resonance curve can be viewed in terms of the frequency
at which the peak occurs

$$\omega_p = (\omega_0^2 - \gamma^2/2)^{1/2} = \omega_0 (1 - 1/2Q^2)^{1/2} \qquad (44)$$

and the amplitude of the peak

$$A(\omega_p) = Q/(\omega_0^2 (1 - 1/4Q^2)^{1/2}). \qquad (45)$$

If the oscillator is undamped ($\gamma = 0$, $Q = \infty$) the peak occurs at
its natural frequency and shows an infinite response. Adding
damping lowers the amplitude of the peak and shifts it.
However, the shift is very small unless the system is much more 
damped (Q < 2) than occurs for seismic wave attenuation. The 
damping also spreads out the peak in frequency, so the more 
the damping, the broader and lower the peak. To see why, 
recall that the more the damping, the faster the oscillation 
decays as a function of time (Fig. 3.7-11). As we will see in 
Chapter 6, the spectrum of an undamped sinusoid is a sharp 
line, or delta function, so additional frequencies, and thus a 
broader peak, correspond to the decaying sinusoid. The phase 
response also has significance, as we will see when we discuss 
seismometers (Section 6.6). 
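The peak position and height in Eqns 44 and 45 can be checked numerically against the amplitude response of Eqn 43. The sketch below is illustrative only: the values $\omega_0 = 1$ and $Q = 5$ are assumptions, $Q = \omega_0/\gamma$ is used as implied by Eqn 44, and `atan2` is used so that the phase runs continuously from 0 to $-\pi$, which is one choice of branch for the inverse tangent.

```python
import math


def amplitude(omega, omega0, gamma):
    """Amplitude response of the forced, damped oscillator (Eqn 43)."""
    return ((omega0 ** 2 - omega ** 2) ** 2 + (omega * gamma) ** 2) ** -0.5


def phase(omega, omega0, gamma):
    """Phase response (Eqn 43), on the branch running from 0 to -pi."""
    return math.atan2(-gamma * omega, omega0 ** 2 - omega ** 2)


omega0, q = 1.0, 5.0            # hypothetical oscillator
gamma = omega0 / q

# Brute-force search for the peak on a fine frequency grid.
grid = [0.5 + 1e-4 * k for k in range(10001)]
omega_peak = max(grid, key=lambda w: amplitude(w, omega0, gamma))
amp_peak = amplitude(omega_peak, omega0, gamma)

# Predictions from Eqns 44 and 45.
omega_pred = omega0 * math.sqrt(1.0 - 1.0 / (2.0 * q ** 2))
amp_pred = q / (omega0 ** 2 * math.sqrt(1.0 - 1.0 / (4.0 * q ** 2)))

print(omega_peak, omega_pred)
print(amp_peak, amp_pred)
```

Even for this heavily damped case the peak lies only about 1% below the natural frequency, consistent with the statement that the shift is negligible for seismic values of Q.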

The resonance curve concept appears in a wide variety of applications,
because many physical systems can be viewed as
damped harmonic oscillators. Three commonly considered in
seismology are the attenuation of the earth’s normal modes, the
behavior of a seismometer, and the response of a building to
ground motion. An earthquake puts energy at various frequencies
into the earth, exciting its normal modes (Section 2.9).
These modes form a set of damped harmonic oscillators, so the
amplitude spectrum of a long-period seismogram contains
peaks that correspond to the net resonance curve for each mode
multiplet. The width of a peak depends on the frequencies and
amplitudes of the mode’s singlets and the mode’s damping.
Seismometers can also be viewed as damped harmonic oscillators,
whose natural frequency and damping control their
response to ground motion. In addition, as mentioned in
Section 1.2.2, buildings can be considered damped harmonic
oscillators. This concept is important in designing earthquake-resistant
structures, because buildings are most vulnerable to
ground motion with frequencies close to their natural frequencies,
so damping is added to reduce the resulting motion.

3.7.8 Physical dispersion due to anelasticity

An important consequence of seismic wave attenuation is
physical dispersion, in which waves at different frequencies
travel at different velocities. This differs from the geometrical
dispersion discussed in Sections 2.7 and 2.8, in which surface
waves of different frequencies have different apparent velocities
at the surface because they sample different depths and
hence encounter material of different velocities. Thus, although
the intrinsic velocity of the rock at any depth is treated as
frequency-independent, dispersion occurs because of the depth-variable
velocity of the material. By contrast, with physical
dispersion the intrinsic velocity of waves in the medium varies
with frequency.$^5$

To see how physical dispersion results from attenuation,
consider how a seismic wave changes shape. Assume that a
delta function wave, a pulse of infinite height and unit area
(Fig. 3.7-14), propagates through a homogeneous elastic
medium with intrinsic velocity c:

$$u(x, t) = \delta(t - x/c). \qquad (46)$$

5 The rainbow results from physical dispersion for light waves passing through
water drops in the atmosphere or a prism. Different frequencies (colors) of light travel
at different speeds through the water or prism, and thus refract at different angles,
separating initially white light into different colors.

Fig. 3.7-14 Left: A propagating wave pulse
composed of a delta function. With no
dispersion, all frequencies arrive at the same
time. Center: The delta function after
broadening by attenuation, showing that
energy arrives before the high-frequency
arrival time. Right: The pulse including
physical dispersion, which makes the lower-frequency
waves travel more slowly, so that
they do not arrive before the highest-frequency
component. (Time-axis labels: $t = x/c$, $t = x/c$, $t = x/c_\infty$.)

The Fourier transform of the delta function,

$$F(\omega) = \int_{-\infty}^{\infty} u(x, t) e^{-i\omega t} dt = \int_{-\infty}^{\infty} \delta(t - x/c) e^{-i\omega t} dt = e^{-i\omega x/c}, \qquad (47)$$

shows that the delta function is made up of waves of all frequencies,
as we discuss further in Section 6.2.5. If there is
no dispersion, all the frequencies travel at the same speed and
arrive at the same time. The effect of attenuation as a function
of distance is given by writing Eqn 40 as a function of
frequency,

$$A(\omega) = e^{-\omega x/2cQ}, \qquad (48)$$

which shows that if Q is constant, the rate at which the amplitude
decays with distance increases strongly with frequency.
To see how this attenuation affects the delta function wave,
we multiply Eqn 47 by Eqn 48 and use the inverse Fourier
transform to return to the time domain

$$u(x, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} A(\omega) F(\omega) e^{i\omega t} d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-\omega x/2cQ} e^{-i\omega x/c} e^{i\omega t} d\omega. \qquad (49)$$

Evaluating the integral yields

$$u(x, t) = [(x/2cQ)/((x/2cQ)^2 + (x/c - t)^2)]/\pi, \qquad (50)$$

so the delta function is broadened by attenuation into a
wavelet that is symmetric in time about its maximum at $t = x/c$
(Fig. 3.7-14, center).
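A short numerical check makes the symmetry of Eqn 50 concrete. The sketch below evaluates the attenuated pulse and integrates it on either side of $t = x/c$. The path length, velocity, and Q are hypothetical values, and the midpoint-rule integrator is just a convenience for this illustration.

```python
import math


def attenuated_pulse(t, x, c, q):
    """Eqn 50: delta function broadened by constant-Q attenuation."""
    a = x / (2.0 * c * q)            # half-width of the pulse, in seconds
    return (a / (a ** 2 + (x / c - t) ** 2)) / math.pi


def integrate(f, lo, hi, n):
    """Midpoint-rule integral of f from lo to hi using n intervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


x, c, q = 1000.0, 5.0, 100.0         # hypothetical: km, km/s, dimensionless
t_arr = x / c                        # geometric arrival time, 200 s


def pulse(t):
    return attenuated_pulse(t, x, c, q)


early = integrate(pulse, t_arr - 50.0, t_arr, 5000)   # area before t = x/c
late = integrate(pulse, t_arr, t_arr + 50.0, 5000)    # area after t = x/c
print(early, late)
```

The two areas are equal, so half of the pulse arrives before the geometric arrival time. This is the behavior that the following discussion identifies as physically impossible.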

A problem with this solution is that seismic energy arrives
before the geometric arrival time of the delta function pulse,
$t = x/c$, which is the arrival time of the infinite-frequency
component. In fact, because the tails of the wavelet extend
to infinity on both sides of $t = x/c$, some energy arrives before
the earthquake occurred. This impossible situation, called
noncausality, results from the fact that attenuation broadened
the pulse by preferentially removing the high-frequency
components.$^6$

Thus the physical mechanisms that cause attenuation in the
earth must prevent waves of all frequencies from traveling at
the same speed. Instead, there must be dispersion, where the
lower frequencies causing the tails travel more slowly and
arrive later. We saw in Section 2.8 that in a dispersive medium
we distinguish the phase velocity c, the speed of a wave of a single
frequency, from the group velocity that describes the speed of a
wave group. Thus the mathematical condition for causality
is that $u(x, t) = 0$ for all $t < x/c_\infty$, where $c_\infty = c(\infty)$ is the phase
velocity of the infinite-frequency waves that arrive first. One
such dispersion relation for phase velocity as a function of
frequency, called Azimi’s attenuation law, is

$$c(\omega) = c_0 \left[1 + \frac{1}{\pi Q} \ln\left(\frac{\omega}{\omega_0}\right)\right], \qquad (51)$$

where $c_0$ is a reference velocity corresponding to a reference
frequency $\omega_0$.$^7$ This relation provides the needed causality,
because the resulting pulse (Fig. 3.7-14, right) has high frequencies
arriving at or soon after $t = x/c_\infty$, whereas the low frequencies
arrive later over a duration depending on the value
of Q. If there is no attenuation ($Q = \infty$), Eqn 51 yields no dispersion,
and the delta function is not broadened.

6 We noted a similar effect in Section 2.9.8: namely, that individual normal modes of
a single frequency appear to predict displacement before a wave could arrive, but their
sum gives a wave at the correct time.

7 Aki and Richards (1980).

From Eqn 51, the P- and S-wave velocities $\alpha$ and $\beta$ vary as a
function of period T, as

$$\alpha(T) = \alpha(1)\left[1 - \frac{\ln T}{\pi Q_\alpha}\right], \qquad \beta(T) = \beta(1)\left[1 - \frac{\ln T}{\pi Q_\beta}\right], \qquad (52)$$

where $\alpha(1)$ and $\beta(1)$ are the velocities at 1 s. We find how a
wave’s travel time varies with period by integrating along its
ray path (Eqn 3.4.16). The effect can be significant. For a vertical
ScS wave, the travel time for T = 40 s is 5 s longer than for
T = 1 s. Out of a total travel time of about 934 s, this is a difference
of 0.5%. For vertical PcP waves at the same periods, the
travel time difference is 1 s out of 511 s, or 0.2%.
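To see how a delay of this size follows from the logarithmic dispersion relation, the sketch below applies it to a path with a single, uniform Q. The uniform path-average value $Q_\beta = 220$ is an assumption made for illustration (the text integrates along the ray through a depth-varying Q), and the function names are hypothetical.

```python
import math


def velocity_at_period(v_1s, period, q):
    """Velocity at a given period (s), relative to its 1 s value (Eqn 51 with omega/omega_0 = 1/T)."""
    return v_1s * (1.0 - math.log(period) / (math.pi * q))


def travel_time_at_period(t_1s, period, q):
    """Travel time at a given period for a path with uniform Q (an assumption)."""
    return t_1s / (1.0 - math.log(period) / (math.pi * q))


# Vertical ScS: 934 s travel time at 1 s period, assumed path-average Q of 220.
delay = travel_time_at_period(934.0, 40.0, 220.0) - 934.0
print(delay)  # several seconds, comparable to the 5 s quoted in the text
```

Because the correction scales as $1/Q$, the smaller delay for PcP is consistent with P waves having a higher Q than S waves along similar paths.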

This phenomenon causes a discrepancy between the seismic 
velocity structure found by inverting observations of long- 
period normal modes and short-period body waves. The 
velocities inferred from normal modes are consistently slower 
than those from body waves. The discrepancy reflects the fact 
that attenuation causes longer-period waves that are studied 
as normal modes to travel at lower velocities than the body 
waves. Failure to take this effect into account can cause 
errors in the predicted arrival times of body waves of several 
seconds. 

The pulse in Fig. 3.7-14 (right) is also known as an attenuation
operator, and can be used to model the effects of attenuation
on seismic waveforms. As discussed in Section 3.3.6 and
derived in Section 6.3, seismic signals can be modeled by
convolving the source-time function with operators describing
different effects. Thus a synthetic seismogram computed for an
elastic earth can be convolved with the attenuation operator
to create a more realistic pulse.

Fig. 3.7-15 Top: Schematic diagram of a standard linear solid, made up
of a mass connected to two springs and a dashpot. This system responds
elastically to waves with periods that are short compared to the relaxation
time, $\tau$, and viscously for periods longer than the relaxation time. Center:
Absorption peak for this material. $Q^{-1}$ approaches zero for large and small
values of $\omega\tau$ and is greatest for $\omega\tau = 1$. Bottom: Phase velocity dispersion
resulting from the attenuation. The velocity is $c_0$ at low frequencies and
increases to $c_\infty$ at high frequencies. (Horizontal axis: $\omega\tau$.)

Body wave attenuation is often characterized using the parameter
$t^*$. If a ray travels through a region of constant Q,

$$t^* = \frac{t}{Q} = \frac{\text{travel time}}{\text{quality factor}}. \qquad (53)$$

Because Q varies within the earth, we derive $t^*$ by integrating
along the ray path,

$$t^* = \int \frac{dt}{Q(r)} = \sum_{i=1}^{N} \frac{\Delta t_i}{Q_i}, \qquad (54)$$

where $\Delta t_i$ and $Q_i$ are the travel time and Q values on the $i$th path
segment. For P waves, $t^*$ is often about 1 s, whereas S waves
typically have $t^*$ around 4 s. The values of $t^*$ increase with
increased distance, but are also affected by the number of
passages through the asthenosphere (about 80-220 km depth).
For example, ScS tends to have a higher $t^*$ (greater attenuation)
than S at the same distance because of the longer ray path, and
S waves from deep earthquakes that only cross the asthenosphere
once have lower $t^*$ than S waves from shallow events.
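The sum in Eqn 54 is simple to evaluate once a ray has been divided into segments. The sketch below is a minimal illustration: the segment travel times and Q values are hypothetical, and the amplitude factor comes from replacing $t/Q$ by $t^*$ in the time-decay law $A = A_0 e^{-\omega t/2Q}$, with $\omega = 2\pi f$.

```python
import math


def t_star(segments):
    """Eqn 54: sum of (travel time / Q) over the path segments."""
    return sum(dt / q for dt, q in segments)


def amplitude_factor(freq_hz, tstar):
    """Amplitude reduction exp(-omega t*/2) = exp(-pi f t*)."""
    return math.exp(-math.pi * freq_hz * tstar)


# Hypothetical S-wave path: (travel time in s, Q) for lid, asthenosphere, deeper mantle.
path = [(20.0, 600.0), (35.0, 80.0), (400.0, 300.0)]
print(t_star(path))

# Effect of t* = 4 s at two frequencies.
print(amplitude_factor(0.1, 4.0), amplitude_factor(1.0, 4.0))
```

In this example the asthenosphere segment contributes roughly a quarter of $t^*$ despite accounting for under a tenth of the travel time, which illustrates why the number of asthenosphere crossings matters.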


3.7.9 Physical models for anelasticity 

A common model for the anelastic processes in the earth causing
attenuation treats the material as a viscoelastic or standard
linear solid, which combines elastic and viscous responses to
an incident seismic wave. This model is represented by a spring
with constant $k_1$ in parallel with a spring with constant $k_2$ and a
dashpot with viscosity $\eta$ (Fig. 3.7-15). If a step function strain
H(t) (0 for t < 0, 1 afterwards) is applied, the stress response
includes an instantaneous elastic contribution from spring $k_1$
and a delayed response from the dashpot and spring $k_2$,

$$\sigma(t) = k_1 H(t) + k_2 e^{-t/\tau}, \qquad (55)$$

where $\tau = \eta/k_2$ is the relaxation time constant.

The response to harmonic waves depends on the product
of the angular frequency and the relaxation time. For wave
periods that are very short compared to the relaxation time, the
system responds mostly elastically, and there is little attenuation.
For wave periods much longer than the relaxation time,
the system responds mostly in a viscous manner, so there is no
attenuative loss of energy. As shown in Fig. 3.7-15, the attenuation$^8$
varies as

$$Q^{-1}(\omega) = \frac{k_2}{k_1} \frac{\omega\tau}{1 + (\omega\tau)^2}. \qquad (56)$$

At very low or very high frequencies $Q^{-1}(\omega)$ approaches zero,
so Q becomes infinite. The greatest attenuation, or absorption
peak,$^9$ occurs at $\omega\tau = 1$, where

$$Q^{-1}_{max} = Q^{-1}(1/\tau) = k_2/2k_1. \qquad (57)$$

The phase velocity also depends on $\omega\tau$:

$$c(\omega) = c_0 \left[1 + \frac{k_2}{2k_1} \frac{(\omega\tau)^2}{1 + (\omega\tau)^2}\right], \qquad (58)$$

where $c_0 = (k_1/\rho)^{1/2}$. The phase velocity is lowest ($c_0$) at low frequencies,
and reaches

$$c_\infty = c_0 (1 + k_2/2k_1) = c_0 (1 + Q^{-1}_{max}) \qquad (59)$$

at high frequencies. This model thus has the key feature of the
physical dispersion relation (Eqn 51) discussed earlier, that
long-period waves travel more slowly than high-frequency
waves.
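The behavior of the standard linear solid in Eqns 56-59 can be explored with a few lines of code. The spring constants, relaxation time, and low-frequency velocity below are hypothetical values chosen only so that the peak attenuation is $Q^{-1}_{max} = 0.01$.

```python
def q_inverse(omega, tau, k1, k2):
    """Eqn 56: attenuation of a standard linear solid."""
    wt = omega * tau
    return (k2 / k1) * wt / (1.0 + wt ** 2)


def phase_velocity(omega, tau, k1, k2, c0):
    """Eqn 58: phase velocity of a standard linear solid."""
    wt = omega * tau
    return c0 * (1.0 + (k2 / (2.0 * k1)) * wt ** 2 / (1.0 + wt ** 2))


# Hypothetical parameters: tau = 1 s, k2/k1 = 0.02, c0 = 4.5 km/s.
tau, k1, k2, c0 = 1.0, 1.0, 0.02, 4.5

for omega in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(omega, q_inverse(omega, tau, k1, k2),
          phase_velocity(omega, tau, k1, k2, c0))
```

The printed attenuation rises to its maximum at $\omega\tau = 1$ and falls off symmetrically on a logarithmic frequency axis, while the velocity climbs monotonically from $c_0$ toward $c_\infty$.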

8 Kanamori and Anderson (1977).

9 This effect is like driving over a bump: at a high speed inertia keeps the car in line
and the bump is not very noticeable. At low speed, we feel only a gradual swell in the
road. However, at an intermediate speed the bump gives the maximum jolt.

Fig. 3.7-16 Top: Relaxation spectrum for a polycrystalline material
showing attenuation peaks at different frequencies due to different
microscopic mechanisms. Bottom: Schematic model to explain the
observation that Q is roughly constant over a wide range of frequencies.
The superposition of absorption peaks for different compositions at
different temperatures and pressures yields a flat absorption band.
(Liu et al., 1976.)

Given this model, the fact that seismological observations
find relatively constant Q over a large range of low frequencies
from about 0.001 to 0.1 Hz (Fig. 3.7-12) is surprising.
Moreover, theoretical and laboratory studies of the physical
mechanisms thought to cause attenuation in the earth also
suggest that Q should be strongly frequency-dependent. Hence
the relatively constant value at low frequencies is thought to
result from the superposition of many different mechanisms. A
possible explanation comes from noting that a typical attenuation
spectrum for a polycrystalline structure (Fig. 3.7-16, top)
contains multiple attenuation peaks or absorption bands. The
absorption bands depend on the material’s composition and
grain size and vary with temperature (recall Fig. 3.7-2) and pressure,
such that higher pressure decreases attenuation, whereas
higher temperature increases it. Waves of various frequencies
traversing the earth may feel the net effect of absorption
bands with different relaxation times, yielding a flat absorption
spectrum (Fig. 3.7-16, bottom). The higher-frequency waves
in Fig. 3.7-12 that show a frequency-dependent Q would be
above the flat part of the absorption spectrum.
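The superposition idea can be illustrated by summing absorption peaks of the form in Eqn 56 for many relaxation times. In the sketch below the relaxation times are assumed to be evenly spaced in logarithm (four per decade from $10^{-3}$ to $10^{3}$ s) and every peak is given the same strength; both choices are simplifications for illustration, not a model of the earth.

```python
def absorption_peak(omega, tau):
    """Shape of a single absorption peak, omega*tau / (1 + (omega*tau)^2)."""
    wt = omega * tau
    return wt / (1.0 + wt ** 2)


def absorption_band(omega, taus):
    """Net attenuation (arbitrary units) from equal-strength peaks."""
    return sum(absorption_peak(omega, tau) for tau in taus)


# Hypothetical relaxation times: four per decade from 1e-3 s to 1e3 s.
taus = [10.0 ** (-3.0 + 0.25 * k) for k in range(25)]

inside = [absorption_band(w, taus) for w in (0.1, 1.0, 10.0)]
outside = absorption_band(1.0e5, taus)
print(inside, outside)
```

Within the band the summed attenuation varies by well under a few percent over two decades of frequency, whereas far above the band it drops sharply, so Q rises with frequency there.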

3.7.10 Q from crust to inner core 

Attenuation is inferred in all regions of the earth except for
the liquid iron outer core, and varies greatly both laterally and
vertically. In the crust, the greatest attenuation (lowest Q or
highest $Q^{-1}$) occurs near the surface (Fig. 3.7-17), presumably
due to the presence of fluids. Attenuation is lowest at about
20-25 km depth, and then increases again, presumably due to
increasing temperature. Attenuation decreases as a function of
frequency, as in Fig. 3.7-12, and varies geographically. Q in the
upper crust is roughly proportional to the time since the last
major tectonic activity in a region, perhaps due to crack generation
and fluid flow during tectonism and gradual crack annealing
after tectonism ceased.

Fig. 3.7-17 Variation in attenuation with lithospheric depth for the
eastern USA (EUS) and Basin and Range (B&R). Lower attenuation occurs
for higher frequencies. (Mitchell, 1995. Rev. Geophys., 33, 441-62,
copyright by the American Geophysical Union.)

Regional variations in crustal Q are often studied with Lg
waves, a superposition of higher-mode surface waves that give
prominent arrivals in continental regions. $Q_{Lg}$ for the USA varies
regionally (Fig. 3.7-18), with values as high as 750 in the
stable East and as low as 250 in the tectonically active West.
This regional difference in attenuation, also seen in Figs 3.7-1
and 3.7-17, has implications for seismic hazards (Section 1.2.2).
Similarly, the fact that the USA tested nuclear weapons in the
western USA, which is more attenuative than the areas used by
the Soviet Union, is significant for verifying test ban treaties
(Section 1.2.8).


Attenuation in the upper mantle varies with depth, with the
lowest Q in the asthenosphere from about 80 to 220 km depth
(Fig. 3.7-19). At these depths the temperature approaches, and
perhaps exceeds, the melting temperatures of rock, so a small
percentage of partial melt may exist. This pattern of attenuation
is similar to that for seismic velocities, which are lowest
in the asthenosphere. Hence both the elastic velocity and
anelastic attenuation reflect the physical processes causing the
mechanically weak asthenosphere. Beneath the asthenosphere,
Q increases gradually with depth, presumably because temperature
increases at a slower rate than pressure.

Q increases with depth through the lower mantle, reaching
values in excess of 500. There is some indication that attenuation
is enhanced in the D" region at the base of the mantle.
Although no attenuation of P waves is detected for the outer
core, there is significant attenuation of PKIKP waves traversing
the inner core, yielding $Q_K$ estimates in the range of 150-300.

Lateral variations in attenuation are studied using tomographic
methods similar to those used for velocity (Sections
2.8.3, 7.3). Where temperatures vary over short distances,
significant attenuation variations can occur, as shown in Fig.
3.7-3 for a mid-ocean ridge. Similarly, a cross-section through
the back-arc spreading center above the Tonga subduction
zone (Fig. 3.7-20) shows that $Q_\alpha$ exceeds 10,000 within the
cold and rigid subducting slab, but is less than 75 beneath the
hot back-arc basin. Such attenuation data, especially when
combined with velocity data, are valuable for tectonic studies.

3.8 Composition of the mantle and the core 

Seismology yields information about velocities within the
earth. To derive inferences about the composition of the earth,
the seismological data are combined with results from geology,
geodesy, geomagnetism, cosmochemistry, and the physics and
chemistry of materials at high temperature and pressure. A
general view of the earth’s composition has emerged, although
aspects are still under investigation. This view is a cornerstone
of our thinking about the evolution of the earth and other
planets. We will summarize some basic ideas that are presently
under discussion, and the suggested readings provide more
information.

Fig. 3.7-18 $Q_{Lg}$ for the USA mapped
from the codas of 1 Hz Lg waves. $Q_{Lg}$,
which reflects attenuation within the
crust, shows higher attenuation in the
tectonically active western USA and lower
attenuation in the tectonically inactive east.
(Mitchell et al., 1997.)

Fig. 3.7-19 Models of Q in the upper mantle showing that attenuation is
highest at 80-220 km depth and then decreases with depth. (Panel title:
Average Q model; axis: Depth (km).) (Romanowicz,
1995. J. Geophys. Res., 100, 12,375-94, copyright by the American
Geophysical Union.)

3.8.1 Density within the earth 

A starting point for analysis of the earth’s composition is a
model of the variation in density with depth. The density is
an important constraint on the nature of the material, and
can be combined with velocities to derive elastic constants.
Densities are less well known than velocities, and their estimation
requires more inferences. As with velocities, we use a
radially symmetric density model for most applications and
consider lateral perturbations when needed.

The basic constraint on the earth’s density is that its average
is given by the earth’s mass M, which can be found from the
acceleration of gravity at the surface $r = a$ using the law of
gravitation,

$$g = GM/a^2. \qquad (1)$$

Because $g \approx 9.8$ m/s$^2$, $G = 6.67 \times 10^{-11}$ N m$^2$ kg$^{-2}$, and $a =
6371$ km, we find $M = 5.97 \times 10^{24}$ kg. The mass is the volume
integral of the density, so if density varies only with depth

$$M = 4\pi \int_0^a \rho(r) r^2 dr, \qquad (2)$$

the average density, $\rho_0$, is found by dividing the mass by the
volume,

$$\rho_0 = M/[(4/3)\pi a^3]. \qquad (3)$$

The resulting average density of the earth is about 5.5 g/cm$^3$.
The fact that this value is significantly higher than the density
of the surface rocks (about 3 g/cm$^3$) is evidence for a core of
much denser, and hence presumably different, material.
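The arithmetic behind Eqns 1 and 3 can be reproduced directly from the constants quoted in the text. This is only a worked check; the variable names are arbitrary, and the mass comes out slightly below the quoted value because g is rounded to 9.8 m/s$^2$.

```python
import math

G = 6.67e-11        # gravitational constant, N m^2 kg^-2
g_surface = 9.8     # surface gravity, m/s^2
a = 6.371e6         # earth radius, m

mass = g_surface * a ** 2 / G                           # Eqn 1 solved for M
mean_density = mass / ((4.0 / 3.0) * math.pi * a ** 3)  # Eqn 3

print(mass)                    # kg
print(mean_density / 1000.0)   # g/cm^3
```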

A second constraint on the density, which also indicates
a dense core, comes from the moment of inertia about the
rotation axis. This is defined by (Fig. 3.8-1) integrating over
volumes dV, each at a distance $l = r \sin\theta$ from the spin axis,

$$C = \int \rho(r) l^2 dV = \frac{8\pi}{3} \int_0^a \rho(r) r^4 dr. \qquad (4)$$



Fig. 3.7-20 Cross-section across the
Tonga subduction zone, showing large
lateral variations in $Q_\alpha$ between the cold
subducting slab (black) and the hotter
back-arc basin. (Roth et al., 1999.
J. Geophys. Res., 104, 4795-809,
copyright by the American
Geophysical Union.)











Fig. 3.8-1 A planet’s moment of inertia is found by integrating about the
spin axis. The moment arm, $l$, to a volume element, dV, is $r \sin\theta$.

The ratio of the moment of inertia to the mass gives a scalar
that depends on the density distribution. If the earth were
homogeneous, the density everywhere would equal the average
density, $\rho(r) = \rho_0$, and

$$C = (8/15)\pi a^5 \rho_0, \qquad M = (4/3)\pi a^3 \rho_0, \qquad C/Ma^2 = 0.4. \qquad (5)$$

Alternatively, if all the mass were in a shell at the surface,
the density distribution could be written as a delta function
$\rho(r) = \delta(r - a)\rho_s$. Using the properties of the delta function
(Section 6.2.5), Eqns 2 and 4 yield

$$C = (8/3)\pi \rho_s a^4, \qquad M = 4\pi \rho_s a^2, \qquad C/Ma^2 = 0.67. \qquad (6)$$

As expected, a distribution with material concentrated toward
the outside gives a larger ratio.

A more realistic case is a two-shell planet, with a mantle of
density $\rho_m$ and a core of density $\rho_c$ and radius $r_c$. The integrals
are evaluated in pieces as

$$C = \frac{8}{3}\pi \left[\int_0^{r_c} \rho_c r^4 dr + \int_{r_c}^{a} \rho_m r^4 dr\right] = \frac{8}{15}\pi [\rho_m a^5 + (\rho_c - \rho_m) r_c^5],$$

$$M = \frac{4\pi}{3} [\rho_m a^3 + (\rho_c - \rho_m) r_c^3]. \qquad (7)$$

Values similar to those for the earth ($\rho_c = 12$ g/cm$^3$, $\rho_m =
5$ g/cm$^3$, $r_c = 3480$ km) yield a moment of inertia ratio of
$C/Ma^2 = 0.35$. This value is less than the 0.4 which a uniform
planet would have, because the material is concentrated toward
the center. It is similar to the value of $C/Ma^2$ for the earth
determined from the earth’s shape and gravity field. The earth’s
value, about 0.33, thus indicates the presence of a dense core.
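The two-shell result can be verified by evaluating Eqn 7 with the values quoted above. The function name is hypothetical. Because only the ratio is computed, the density and length units cancel, so g/cm$^3$ and km can be used directly.

```python
import math


def two_shell_inertia_ratio(rho_c, rho_m, r_c, a):
    """C / (M a^2) for a uniform core and uniform mantle (Eqn 7)."""
    c_moment = (8.0 / 15.0) * math.pi * (rho_m * a ** 5 + (rho_c - rho_m) * r_c ** 5)
    mass = (4.0 / 3.0) * math.pi * (rho_m * a ** 3 + (rho_c - rho_m) * r_c ** 3)
    return c_moment / (mass * a ** 2)


# Values from the text: densities in g/cm^3, radii in km.
print(two_shell_inertia_ratio(12.0, 5.0, 3480.0, 6371.0))

# Limiting case: equal densities recover the homogeneous value of Eqn 5.
print(two_shell_inertia_ratio(5.0, 5.0, 3480.0, 6371.0))
```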

Although the mass and moment of inertia give only integral
constraints on the density, seismic velocities give information
on the variation of density with depth. We first consider a
region of uniform material and see how the density increases
with depth as the material is self-compressed by its own weight.
At a radius r, the gradient of the hydrostatic pressure P(r) is

$$\frac{dP}{dr} = -\rho(r) g(r), \qquad (8)$$

where $\rho(r)$ and $g(r)$ are the density and the acceleration of gravity
at that depth. The derivative is negative, because pressure
increases with depth. The local value of gravity, $g(r)$, depends
on the total mass $m(r)$ within the sphere of radius r,$^1$

$$g = Gm/r^2. \qquad (9)$$

The pressure derivative can then be written as

$$\frac{dP}{dr} = \frac{-\rho G m}{r^2}. \qquad (10)$$

The elastic constants of the material are introduced using the
definitions of the density and the dilatation $\theta$ (Eqn 2.3.60),

$$\rho = m/V, \qquad d\theta = dV/V, \qquad (11)$$

so that differentiation yields

$$d\rho = -(m/V^2) dV = -\rho\, d\theta. \qquad (12)$$

Thus the bulk modulus K can be expressed, starting with its
definition (Eqn 2.3.74), as

$$K = -\frac{dP}{d\theta} = -\frac{dP}{d\rho}\frac{d\rho}{d\theta} = \rho \frac{dP}{d\rho}. \qquad (13)$$

Combining this with the pressure derivative equation (Eqn 10)
gives the change in density with depth

$$\frac{d\rho}{dr} = \frac{d\rho}{dP}\frac{dP}{dr} = \frac{-\rho^2 G m}{K r^2}. \qquad (14)$$

1 $g(r)$ depends only on the mass below radius r, because a spherical shell of uniform
density has no net gravitational effect. This situation arises because gravity varies as
$r^{-2}$, whereas the shell’s mass varies as $r^2$, so larger contributions from the closer portions
of the shell are canceled by those from the rest. The fact that a sphere’s gravitational
attraction is the same as if all its mass were at the center arises in the same way.
This effect is not a general property of the center of mass and does not apply for bodies
of other shapes. However, it applies for the electric field, which also varies as $r^{-2}$,
within a uniformly charged sphere. Deriving this result is said to have delayed Newton
for years before presenting the theory of gravitation in 1686. (Feynman et al., 1963.)

To include the observations of seismic velocities, we define
the seismic parameter, $\Phi$, and bulk sound speed, $\Phi^{1/2}$, such that

$$\Phi = \alpha^2 - (4/3)\beta^2 = K/\rho. \qquad (15)$$

Thus we can write the Adams-Williamson equation relating
the velocity structure to the derivative of density with radius,

$$\frac{d\rho}{dr} = \frac{-\rho(r) G m(r)}{\Phi(r) r^2} = \frac{-\rho(r) g(r)}{\Phi(r)}, \qquad (16)$$

where the dependences on radius are explicitly shown. This
equation can be used to estimate the density structure by
starting with the near-surface density, using the seismic velocities
to find its derivative, and computing the density at a deeper
point. The resulting density and value of g(r) are then used in
the next step.
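The stepping procedure just described can be sketched as a simple downward integration of Eqn 16. Everything specific in this example is an assumption for illustration: the seismic parameter is taken as constant (from $\alpha = 10$ km/s and $\beta = 5.5$ km/s), whereas in the earth it varies with depth; the integration uses plain Euler steps; and the function and variable names are hypothetical.

```python
import math

G = 6.67e-11  # gravitational constant, N m^2 kg^-2


def adams_williamson(rho_top, r_top, r_bottom, mass_top, phi, steps=1000):
    """Integrate Eqn 16 inward from r_top to r_bottom with Euler steps.

    phi is a function of radius returning the seismic parameter (m^2/s^2).
    Returns the density at r_bottom and the mass remaining inside r_bottom.
    """
    dr = (r_top - r_bottom) / steps
    r, rho, m = r_top, rho_top, mass_top
    for _ in range(steps):
        g = G * m / r ** 2                      # Eqn 9
        drho_dr = -rho * g / phi(r)             # Eqn 16
        m -= 4.0 * math.pi * r ** 2 * rho * dr  # remove the shell just crossed
        rho -= drho_dr * dr                     # radius decreases by dr
        r -= dr
    return rho, m


# Hypothetical uniform mantle: constant velocities of 10 and 5.5 km/s.
phi_const = (10.0e3) ** 2 - (4.0 / 3.0) * (5.5e3) ** 2
earth_mass = 9.8 * (6.371e6) ** 2 / G

rho_cmb, m_cmb = adams_williamson(3300.0, 6.371e6, 3.48e6, earth_mass,
                                  lambda r: phi_const)
print(rho_cmb, m_cmb)
```

With these assumed velocities, self-compression alone raises the density from 3300 kg/m$^3$ at the top of the mantle to somewhat above 5000 kg/m$^3$ at its base, and the mass left inside the final radius is what the method assigns to the core.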

However, density increases with depth as a result of mineral
phase changes as well as of self-compression, so the Adams-Williamson
equation is insufficient. This difficulty was identified
in 1936 by K. Bullen, who used the Adams-Williamson
approach to find the density throughout the mantle. He then
computed the moment of inertia of the mantle and subtracted it
from the moment of inertia of the earth, to find the moment of
inertia of the core. Figure 3.8-2 shows the $C/Ma^2$ value calculated
for the core as a function of the assumed density at the top
of the mantle, which is the initial density for the Adams-Williamson
calculation. For reasonable values of near-surface
density, ~ 3.3 g/cm$^3$, the core would have $C/Ma^2$ greater than
0.4, implying that density decreases with depth in the core. This
seems unlikely, because the solid inner core should be denser
than the liquid outer core. Only implausibly high near-surface
densities could cure the problem.

This issue was resolved in the 1950s by F. Birch² in a classic
series of papers showing that at least one of two assumptions
underlying the method was inappropriate. One implicit assumption
is that the temperature increases with depth along
an adiabatic gradient, or “adiabat,” such that if a piece of
material moves vertically, the pressure-induced temperature
change leaves the material at the same temperature as its new
surroundings (Eqn 5.4.10). However, the temperature gradient
in the mantle is thought to exceed the adiabatic gradient,
because a superadiabatic gradient is required for the thermal
convection expected in the mantle.³


2 Francis Birch (1903-92) pioneered the use of rock and mineral physics in studies of
the earth’s composition.

3 For an adiabatic gradient, rising material reaches the same temperature, and hence
density, as its surroundings, and thus has no tendency to continue rising. However,
for a superadiabatic gradient, the rising material remains hotter and less dense than its
surroundings, and thus tends to continue rising.



Fig. 3.8-2 Moment of inertia ratio of the earth’s core as a function
of density at the top of a uniform mantle. For any realistic upper
mantle density the ratio would exceed 0.4, implying that the outer
core is denser than the inner core. The alternative is that density
increases beyond self-compression occur in the mantle. (After Birch,
1954. Trans. Am. Geophys. Un., 35, 79-85, copyright by the
American Geophysical Union.)


The superadiabatic gradient can be included by modifying the
Adams-Williamson equation (16) to


\[
\frac{d\rho}{dr} = -\frac{\rho(r)\,g(r)}{\Phi(r)} + \alpha\,\rho(r)\,\tau(r)
\]


(17)


where α is the coefficient of thermal expansion,⁴ and τ is the
portion of the temperature gradient exceeding the adiabatic
gradient. This correction for higher temperature lowers the
calculated mantle densities, and hence increases the calculated
C/Ma² for the core, making the problem of the core density
structure worse.

Hence the assumption of homogeneous material whose
density changes only by self-compression must be incorrect.
Birch showed that inhomogeneity can be identified using the
function 1 − (1/g) dΦ/dr. Figure 3.8-3 compares values of this
function derived from seismic velocity data with values predicted
for compression of homogeneous mantle material. Below
1000 km the mantle behaves as a homogeneous material, while
at shallower depths it does not. This is because the mineral
phase transitions expected at the 410 and 660 km discontinuities
involve denser atomic packings, and therefore transitions to
higher densities, than predicted by the Adams-Williamson
equation.
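
To see how this function separates the two regions, it can be evaluated by finite differences from the velocities in Table 3.8-1. The sketch below is only illustrative: it assumes a constant gravity of 10 m/s² (our `G_ASSUMED`), and the function name `birch_function` is ours rather than standard usage.

```python
# ASSUMPTION: gravity is taken as a constant 10 m/s^2 throughout the mantle.
G_ASSUMED = 10.0


def seismic_parameter(alpha, beta):
    """Eqn 15: Phi = alpha^2 - (4/3) beta^2, in km^2/s^2."""
    return alpha ** 2 - (4.0 / 3.0) * beta ** 2


def birch_function(top, bottom, g=G_ASSUMED):
    """Finite-difference value of 1 - (1/g) dPhi/dr between two table rows.

    Each row is (depth km, alpha km/s, beta km/s).
    """
    (z0, a0, b0), (z1, a1, b1) = top, bottom
    dphi = seismic_parameter(a1, b1) - seismic_parameter(a0, b0)  # km^2/s^2
    dr_km = -(z1 - z0)                   # radius decreases as depth increases
    dphi_dr = (dphi / dr_km) * 1000.0    # km/s^2 converted to m/s^2
    return 1.0 - dphi_dr / g


# Rows taken from Table 3.8-1
transition_zone = birch_function((400.0, 9.092, 4.874), (670.0, 10.219, 5.505))
lower_mantle = birch_function((1071.0, 11.552, 6.407), (2071.0, 12.873, 6.928))

print(f"transition zone: {transition_zone:.2f}")
print(f"lower mantle:    {lower_mantle:.2f}")
```

With these inputs the transition-zone value is markedly larger than the lower-mantle value, which is the contrast that Fig. 3.8-3 displays.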

As a result, density models of the earth include rapid changes 
in the transition zone. Figure 3.8-4 shows the velocity and 
density structure for earth model PREM (Table 3.8-1). Within 
the lower mantle, outer core, and inner core, density increases 
smoothly with depth according to the Adams-Williamson 
equation. At the boundaries between these regions, density 


4 The coefficient of thermal expansion, which gives the change in density with
temperature T, is α = −(1/ρ) dρ/dT.
Fig. 3.8-3 Comparison of observed values (dots) of the function
1 − g⁻¹(ΔΦ/Δr) for the mantle, with calculated values (line) of this
function for compression of homogeneous material. In the upper
mantle transition zone, self-compression alone cannot be occurring,
motivating the expectation of mineralogical phase transformations.
(After Birch, 1952. J. Geophys. Res., 57, 227-86, copyright by the
American Geophysical Union.)
Fig. 3.8-5 Density, gravity, pressure, and mass as functions of depth for
the PREM model.

Fig. 3.8-4 Seismic velocities and density for the Preliminary Reference
Earth Model (PREM). (Dziewonski and Anderson, 1981.)


changes sharply. The CMB is the most significant boundary
with respect to density, with an increase from 5.57 g/cm³ for
mantle rock to 9.90 g/cm³ for the liquid iron outer core.
Density also changes sharply at the 410 and 660 km discontinuities.
Such models are developed to satisfy the travel time
data, other seismological data including eigenfrequencies of
the earth’s normal modes, and the constraints on density.

A density profile lets us compute a pressure profile, and thus 
use the results of experiments showing which mineral phases 
exist at particular pressures. To do this we integrate both sides 
of Eqn 8, 


\[
P(r) = -\int_{a}^{r} g(r)\,\rho(r)\,dr ,
\]


(18) 


using ρ(r) and the resulting values of g(r). As shown in Fig. 3.8-5,
pressure starts at 1 bar at the surface, and rises to about
13.3 GPa (133 kbar) at the 410 km discontinuity, 23.8 GPa at
the 660 km discontinuity, 136 GPa at the CMB, 329 GPa at
the ICB, and 364 GPa at the center of the earth.
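
The integration can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to construct PREM: density is interpolated linearly between a subset of the rows of Table 3.8-1, the surface pressure of 1 bar is neglected, and the function names are ours. Mass is first accumulated outward from the center to give g(r) = Gm(r)/r², and pressure is then accumulated inward from the surface.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
R_EARTH_KM = 6371.0  # earth radius implied by Table 3.8-1

# (depth km, density g/cm^3): a subset of Table 3.8-1. Repeated depths mark
# density jumps at discontinuities.
NODES = [
    (0.0, 1.020), (3.0, 1.020), (3.0, 2.600), (15.0, 2.600), (15.0, 2.900),
    (25.0, 2.900), (25.0, 3.381), (220.0, 3.359), (220.0, 3.436),
    (400.0, 3.543), (400.0, 3.724), (600.0, 3.976), (670.0, 3.992),
    (670.0, 4.381), (1171.0, 4.678), (1771.0, 5.003), (2371.0, 5.307),
    (2891.0, 5.566), (2891.0, 9.903), (3371.0, 10.602), (3871.0, 11.191),
    (4371.0, 11.655), (4871.0, 12.010), (5149.5, 12.166), (5149.5, 12.764),
    (5671.0, 12.982), (6371.0, 13.088),
]


def build_cells(nodes, per_segment=200):
    """Split each segment into thin cells: (mid radius m, density kg/m^3, thickness m)."""
    cells = []
    for (z0, rho0), (z1, rho1) in zip(nodes, nodes[1:]):
        if z1 == z0:
            continue  # a density jump, not a layer
        dz = (z1 - z0) / per_segment
        for i in range(per_segment):
            f = (i + 0.5) / per_segment
            z_mid = z0 + f * (z1 - z0)
            rho = rho0 + f * (rho1 - rho0)  # linear interpolation (an approximation)
            cells.append(((R_EARTH_KM - z_mid) * 1e3, rho * 1e3, dz * 1e3))
    return cells  # ordered from the surface to the center


def gravity_and_pressure(nodes):
    """Return ([(depth km, g m/s^2, P GPa), ...], total mass kg)."""
    cells = build_cells(nodes)
    # Pass 1, center outward: accumulate mass to get g = G m(r) / r^2.
    gravity = [0.0] * len(cells)
    mass = 0.0
    for k in range(len(cells) - 1, -1, -1):
        r, rho, dr = cells[k]
        shell = 4.0 * math.pi * r * r * rho * dr
        gravity[k] = G * (mass + 0.5 * shell) / (r * r)
        mass += shell
    # Pass 2, surface inward: accumulate Eqn 18 as a sum of rho * g * thickness.
    profile = []
    pressure = 0.0  # surface pressure (1 bar) is neglected
    for (r, rho, dr), g in zip(cells, gravity):
        pressure += rho * g * dr
        depth_km = R_EARTH_KM - (r - 0.5 * dr) / 1e3  # depth of the cell's base
        profile.append((depth_km, g, pressure / 1e9))
    return profile, mass


def at_depth(profile, depth_km):
    """Return the (depth, g, P) entry whose base is nearest the requested depth."""
    return min(profile, key=lambda row: abs(row[0] - depth_km))


profile, earth_mass = gravity_and_pressure(NODES)
for depth in (400.0, 670.0, 2891.0, 5149.5, 6371.0):
    z, g, p = at_depth(profile, depth)
    print(f"{z:7.1f} km  g = {g:5.2f} m/s^2  P = {p:6.1f} GPa")
```

Because the densities come from PREM, the pressures obtained this way should fall close to the values quoted above; the small differences reflect the coarse sampling and linear interpolation.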

The curve for gravity is interesting. Gravity averages
9.8 m/s² at earth’s surface,⁵ and is zero at earth’s center, where
the mass of the earth pulls evenly in all directions. Gravity


5 The value of gravity at the surface is a complicated function, varying laterally as 
a result of density anomalies within the earth, dynamic forces that lift up or pull 
down the surface, and a latitudinal effect due to the ellipsoidal shape of the earth 
(Section A.7.2). 










Table 3.8-1 PREM Model.

                    Depth (km)   ρ (g/cm³)   α (km/s)   β (km/s)

Ocean                    0.0       1.020       1.450      0.000
                         3.0       1.020       1.450      0.000
Crust                    3.0       2.600       5.793      3.191
                        15.0       2.600       5.793      3.191
                        15.0       2.900       6.792      3.889
                        25.0       2.900       6.792      3.889
Upper mantle            25.0       3.381       8.101      4.479
                        40.0       3.379       8.091      4.473
                        60.0       3.377       8.079      4.465
                        80.0       3.375       8.067      4.457
                        80.0       3.375       8.005      4.377
low-velocity zone      115.0       3.371       7.984      4.363
                       150.0       3.367       7.963      4.350
                       185.0       3.363       7.942      4.338
                       220.0       3.359       7.920      4.325
                       220.0       3.436       8.519      4.589
                       265.0       3.463       8.606      4.620
                       310.0       3.490       8.692      4.651
                       370.0       3.516       8.778      4.683
                       400.0       3.543       8.865      4.714
Transition zone        400.0       3.724       9.092      4.874
                       450.0       3.787       9.347      5.019
                       500.0       3.850       9.601      5.163
                       550.0       3.913       9.856      5.307
                       600.0       3.976      10.111      5.451
                       635.0       3.984      10.165      5.478
                       670.0       3.992      10.219      5.505
Lower mantle           670.0       4.381      10.727      5.913
                       721.0       4.412      10.885      6.061
                       771.0       4.443      11.040      6.207
                       871.0       4.504      11.219      6.277
                       971.0       4.563      11.390      6.344
                      1071.0       4.621      11.552      6.407
                      1171.0       4.678      11.707      6.469
                      1271.0       4.735      11.856      6.527
                      1371.0       4.790      11.998      6.583
                      1471.0       4.844      12.135      6.637
                      1571.0       4.898      12.266      6.689
                      1671.0       4.951      12.394      6.739
                      1771.0       5.003      12.518      6.788
                      1871.0       5.055      12.638      6.836
                      1971.0       5.106      12.757      6.882
                      2071.0       5.157      12.873      6.928
                      2171.0       5.207      12.988      6.973
                      2271.0       5.257      13.103      7.017

Source: Dziewonski and Anderson (1981).

increases slightly across the mantle, reaching a maximum of
10.7 m/s² at the CMB because of the high density of the core
relative to the mantle. Inside the core, gravity decreases nearly 
linearly toward the earth’s center. The high density of the core 
is also shown by the mass distribution; the core has only 16% 
of the earth’s volume, but has almost one-third of the mass. 
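
These two fractions can be checked from Table 3.8-1. The sketch below is a rough estimate under stated assumptions: the core is divided into spherical shells between tabulated depths, each shell is given the mean of its two bounding densities, and the earth's total mass is taken as 5.97 × 10²⁴ kg, a standard value that is not given in this section.

```python
import math

R_EARTH_KM = 6371.0
CMB_DEPTH_KM = 2891.0     # depth of the core-mantle boundary (Table 3.8-1)
M_EARTH_KG = 5.97e24      # ASSUMPTION: standard value, not taken from this section

# (depth km, density g/cm^3): core rows from Table 3.8-1. The repeated depth
# marks the density jump at the inner core boundary.
CORE_NODES = [
    (2891.0, 9.903), (3371.0, 10.602), (3871.0, 11.191), (4371.0, 11.655),
    (4871.0, 12.010), (5149.5, 12.166), (5149.5, 12.764), (5671.0, 12.982),
    (6371.0, 13.088),
]


def shell_mass_kg(r_outer_km, r_inner_km, density_g_cm3):
    """Mass of a uniform spherical shell."""
    volume_m3 = (4.0 / 3.0) * math.pi * (r_outer_km ** 3 - r_inner_km ** 3) * 1e9
    return volume_m3 * density_g_cm3 * 1e3


# Volumes scale as radius cubed.
volume_fraction = ((R_EARTH_KM - CMB_DEPTH_KM) / R_EARTH_KM) ** 3

core_mass_kg = 0.0
for (z0, rho0), (z1, rho1) in zip(CORE_NODES, CORE_NODES[1:]):
    # Each shell takes the mean of its bounding densities (a crude average);
    # the repeated depth gives a shell of zero volume and contributes nothing.
    core_mass_kg += shell_mass_kg(R_EARTH_KM - z0, R_EARTH_KM - z1,
                                  0.5 * (rho0 + rho1))

mass_fraction = core_mass_kg / M_EARTH_KG
print(f"core volume fraction: {volume_fraction:.3f}")
print(f"core mass fraction:   {mass_fraction:.3f}")
```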


Depth (km) ρ (g/cm³) α (km/s) β (km/s)

2371.0 5.307 13.218 7.061 

2471.0 5.357 13.333 7.106 

2571.0 5.407 13.450 7.150 

2671.0 5.457 13.568 7.195 

2741.0 5.491 13.652 7.227 

2771.0 5.506 13.659 7.226 

2871.0 5.556 13.684 7.226 

2891.0 5.566 13.689 7.225 

Outer core 2891.0 9.903 8.065 0.000 

2971.0 10.029 8.199 0.000 

3071.0 10.181 8.360 0.000 

3171.0 10.327 8.513 0.000 

3271.0 10.467 8.658 0.000 

3371.0 10.602 8.795 0.000 

3471.0 10.730 8.926 0.000 

3571.0 10.853 9.050 0.000 

3671.0 10.971 9.167 0.000 

3771.0 11.083 9.278 0.000 

3871.0 11.191 9.384 0.000 

3971.0 11.293 9.484 0.000 

4071.0 11.390 9.579 0.000 

4171.0 11.483 9.668 0.000 

4271.0 11.571 9.754 0.000 

4371.0 11.655 9.835 0.000 

4471.0 11.734 9.912 0.000 

4571.0 11.809 9.985 0.000 

4671.0 11.880 10.055 0.000 

4771.0 11.947 10.123 0.000 

4871.0 12.010 10.187 0.000 

4971.0 12.069 10.249 0.000 

5071.0 12.125 10.309 0.000 

5149.5 12.166 10.355 0.000 

Inner core 5149.5 12.764 10.987 3.434 

5171.0 12.775 10.995 3.440 

5271.0 12.825 11.030 3.465 

5371.0 12.871 11.063 3.487 

5471.0 12.912 11.092 3.508 

5571.0 12.949 11.119 3.526 

5671.0 12.982 11.142 3.542 

5771.0 13.010 11.162 3.556 

5871.0 13.034 11.179 3.568 

5971.0 13.054 11.193 3.578 

6071.0 13.069 11.204 3.585 

6171.0 13.080 11.212 3.590 

6271.0 13.086 11.217 3.594 

6366.0 13.088 11.218 3.595 

6371.0 13.088 11.218 3.595 


3.8.2 Temperature in the earth 

Seismology gives insight into the geotherm, the temperature as
a function of radius, which both controls and reflects the composition,
mineralogy, and evolution of the earth. The geotherm
depends on the sources of heat and modes by which the heat
is transferred upward in the earth. Thermal convection, heat
transfer by the motions of material due to the density changes
resulting from temperature, occurs in the mantle. The most
obvious manifestations of this convection are the mid-ocean
ridges, which are its hot upwelling limbs, and subducting
plates, which are its cold downwelling limbs. A separate convection
system in the fluid outer core is believed to cause the
earth's magnetic field. In addition, heat is transferred by conduction
through the lithosphere, the core-mantle boundary,
and the inner core, which may also be convecting.

The geotherm is harder to estimate than the pressure profile 
and remains a subject of debate. A geotherm is inferred by 
modeling radioactive generation of heat in the crust and the 
mantle, conduction of heat across the lithosphere, CMB, and 
inner core, and adiabatic temperature gradients associated 
with convection in the mantle and the outer core. The predicted 
temperatures are required to match the expected temperatures 
of the phase transitions in the transition zone and the expected 
freezing point of iron at the ICB.⁶ Given the uncertainties
involved, estimates of the temperature at the center of the
earth vary from 5000 K to almost 7000 K,⁷ with recent work
favoring the lower end of this range. 

A sample geotherm for the mantle is shown in Fig. 3.8-6. The 
most striking feature is the contrast between the shallow temperature
gradient in the mid-mantle and the steep gradients in
the upper and lower thermal boundary layers, the lithosphere 
and D". The difference reflects the assumptions that heat is conducted
primarily through the boundary layers, giving the steep
gradients, but is convected between them, yielding a shallower 
near-adiabatic gradient. The predicted temperature rises from 
about 0°C at the surface to about 1300°C at a depth of 100 km, 
giving an average thermal gradient of 13°C/km. From there 
to the base of the mantle the temperature rises only another 
1600°C, corresponding to a low gradient of only about 
0.6°C/km. Over the bottom few hundred kilometers of the 
mantle, however, the temperature rises another 1400°C to a 
CMB temperature of about 4300°C (~4000 K). Thus the temperature
changes across the boundary layers at the surface and
CMB are comparable. However, because the surface area of 
the CMB is only about 30% of the earth’s surface, much more 
heat flows out of the earth than flows out of the core. Most of 
this extra heat is generated by the decay of radioactive isotopes 
in the mantle and the crust. An important caveat is that if there 
are additional thermal boundary layers in the mantle, or if the 
thermal conductivity of the mantle is higher than expected, 
the temperatures in the lower mantle will be elevated, and the 
temperature change across D" will be less. 
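
The gradients and the area ratio quoted in this paragraph follow from simple arithmetic, sketched below. The only inputs not stated in the paragraph are the earth's radius and the CMB depth, which are taken from Table 3.8-1; the mid-mantle gradient is computed over the whole interval from 100 km depth to the CMB, which is an approximation on our part.

```python
R_EARTH_KM = 6371.0
CMB_DEPTH_KM = 2891.0  # from Table 3.8-1

# Temperature rises from about 0 to 1300 deg C across the top 100 km.
lithosphere_gradient = (1300.0 - 0.0) / 100.0          # deg C per km

# A further 1600 deg C is spread over the rest of the mantle.
interior_gradient = 1600.0 / (CMB_DEPTH_KM - 100.0)    # deg C per km

# The last 1400 deg C is added across the bottom boundary layer.
cmb_temperature_c = 1300.0 + 1600.0 + 1400.0

# Surface areas scale as radius squared.
area_ratio = ((R_EARTH_KM - CMB_DEPTH_KM) / R_EARTH_KM) ** 2

print(f"lithosphere gradient: {lithosphere_gradient:.1f} deg C/km")
print(f"interior gradient:    {interior_gradient:.2f} deg C/km")
print(f"CMB temperature:      {cmb_temperature_c:.0f} deg C")
print(f"CMB/surface area:     {area_ratio:.2f}")
```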

The geotherm gives insight into the variations with depth of 
seismic velocity and attenuation and the strength, or stress, the 
material can support (Section 5.7). Higher temperatures reduce 

6 Although our instincts based on water make it strange to think of temperatures 
near 5000° as “freezing,” this occurs as the solid inner core forms from the liquid 
outer core. 

7 Temperatures in the deep earth are often given as absolute (Kelvin) temperatures, 
equal to the Celsius temperatures plus 273.15°. 



Fig. 3.8-6 A sample mantle geotherm with steep temperature gradients 
in the thermal boundary layers at the top and bottom of the mantle and a 
near-adiabatic gradient in the lower mantle. The melting curve, or solidus, 
is also shown. Temperatures are given in absolute temperature. (Stacey, 
1992. From Physics of the Earth , 3rd edn, copyright © 1992 by John 
Wiley & Sons, Inc. (New York). Reprinted by permission.) 

seismic velocity and strength, but increase attenuation. Conversely,
higher pressures increase the velocity and strength,
but reduce attenuation. These properties thus depend on the
balance between the temperature and the pressure. The cold
lithosphere has high velocity and low attenuation, and behaves
as rigid plates. However, the rapidly increasing temperature
with depth brings the geotherm close to, if not above, the
solidus, or melting temperature curve. This yields the low-velocity
zone, where there is high attenuation and weak material
that forms the asthenosphere underlying the moving plates.
In the lower mantle, temperatures are only slightly greater
than in the asthenosphere, so the higher pressures make the
rock stronger. Hence the lower mantle is thought to have
a viscosity that is about 100 times greater than that in the upper
mantle. Temperatures increase rapidly in D", causing velocities
slower than expected from the lower mantle velocity gradient.
The ultra-low-velocity zone at the base of the mantle may be
due to partial melting, showing that the geotherm has intersected
the solidus. As discussed later, the high temperatures
in the core keep the outer core liquid, but the rapid increase in
pressure due to the weight of the outer core makes the inner core
freeze into a denser solid. The inner core is therefore close to
the melting temperature of iron, so it has low shear velocities.

3.8.3 Composition of the mantle 

Models of the composition of the mantle are derived by comparing
the velocity and density (and therefore pressure) profiles
Fig. 3.8-7 Bulk sound speed as a function of density for various materials,
obtained from experiments (lines) compared to the range for the mantle
and the core from seismic observations and density models (shaded). Also
shown are the results for a dunite rock and the composition Fe₂Si. The
numbers shown are mean atomic numbers. (After Birch, 1968. Phys.
Earth Planet. Inter., 1, 141-7, with permission from Elsevier Science.)


derived from seismic data to temperature profiles and results
for earth materials at high pressure and temperature. A key
result from experiments is that the bulk sound speed (Eqn 15)
and the density for a material are approximately linearly
related for a given mean atomic weight. The mean atomic
weight is the mean molecular weight of a formula unit, such
that forsterite (magnesian olivine) Mg₂SiO₄ has m = (2 × 24 +
28 + 4 × 16)/7 = 20, and fayalite (iron olivine) Fe₂SiO₄ has
m = (2 × 56 + 28 + 4 × 16)/7 = 29. Figure 3.8-7 shows this result
for various elements whose atomic numbers⁸ are labeled.
Also shown are ranges of density and bulk sound speed for the
mantle and core derived from seismically based models. The
mantle and the core occupy different parts of the plot.
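
The mean atomic weight calculation generalizes to any formula unit. The short sketch below uses the same rounded atomic weights as the text; the function name and the dictionary layout are ours. As an extra illustration it evaluates an olivine that is 90% forsterite, using fractional site occupancies.

```python
# Rounded atomic weights, as used in the text
ATOMIC_WEIGHT = {"Mg": 24, "Fe": 56, "Si": 28, "O": 16}


def mean_atomic_weight(formula):
    """Formula weight divided by the number of atoms in the formula unit.

    `formula` maps element symbols to the number of atoms per formula unit.
    """
    total_weight = sum(ATOMIC_WEIGHT[element] * n for element, n in formula.items())
    total_atoms = sum(formula.values())
    return total_weight / total_atoms


forsterite = {"Mg": 2, "Si": 1, "O": 4}               # Mg2SiO4
fayalite = {"Fe": 2, "Si": 1, "O": 4}                 # Fe2SiO4
olivine_fo90 = {"Mg": 1.8, "Fe": 0.2, "Si": 1, "O": 4}  # (Mg0.9Fe0.1)2SiO4

for name, formula in [("forsterite", forsterite),
                      ("fayalite", fayalite),
                      ("olivine Fo90", olivine_fo90)]:
    print(f"{name:13s} m = {mean_atomic_weight(formula):.2f}")
```

Replacing magnesium with iron raises the mean atomic weight, which is why more iron-rich compositions plot further to the right in Fig. 3.8-7.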

This result suggests that the mantle and the core are chemically
different, and provides a way of testing which chemical
compositions are plausible. Dunite, a rock containing 92% 
olivine, which in turn is 90% forsterite, fits the mantle data. 
Curves for more iron-rich olivine would plot further to the 
right, such that olivine with more than 50% fayalite would be 
outside the range observed for the mantle. 

The core data plot much further to the right, indicating that 
the core is composed of material of higher atomic number. The 
data are to the left of the curve for pure iron, suggesting that the 
core is composed of iron plus a lower atomic weight (“lighter”) 

8 The atomic number is the number of protons, whereas the atomic weight is the 
number of protons and neutrons. 


Table 3.8-2 Pyrolite model, mineralogy above transition zone.

Mineral               Composition                            wt (%)

Olivine (Fo₈₉)        (Mg₀.₈₉Fe₀.₁₁)₂SiO₄                      57
Orthopyroxene         (Mg, Fe)SiO₃                            17
Clinopyroxene         (Ca, Mg, Fe)₂Si₂O₆-NaAlSi₂O₆            12
Pyrope-rich garnet    (Mg, Fe, Ca)₃(Al, Cr)₂Si₃O₁₂            14

Source: Ringwood (1979).


element. For example, the composition Fe₂Si (iron plus 20%
Si by weight) fits the core data.

Various chemical models for the mantle have been proposed.
The concepts involved can be illustrated by considering a proposed
composition called pyrolite that satisfies various petrological,
cosmochemical, and geophysical constraints. Pyrolite
is similar to natural peridotites (Fig. 3.2-23), which are acceptable
source rocks for basaltic magmas that result from partial
melting of mantle rock. The variation in seismic velocity and
density with depth is assumed to result from transformations
to denser phases as a result of increased pressure. Table 3.8-2
gives a composition whose density at surface temperature
and pressure conditions would be 3.38 g/cm³ and has P- and
S-wave velocities consistent with those observed for the upper
mantle.

In the upper mantle, the model’s major mineral component is
olivine. Such a composition satisfies the density and bulk sound
speed data (Fig. 3.8-7) and is consistent with the observed seismic
anisotropy (Fig. 3.6-4). The transition zone corresponds
to a series of solid state phase changes (Fig. 3.8-8). Olivine
undergoes several transformations before converting to a
perovskite structure in the lower mantle. Pyroxene first transforms
to garnet, and somewhat deeper, the calcium-bearing
component of the garnet transforms to a perovskite structure.
Because of the predicted predominance of perovskite (~70%) in
the voluminous lower mantle, perovskite is the most abundant
material in the earth.

Figure 3.8-9 shows the predicted volume fraction of the
major mineral phases as a function of depth. The α phase
of olivine, which occurs in the crust and the upper mantle,
transforms with increased pressure to its β phase, wadsleyite,
which has a modified spinel structure. This transformation
is observed experimentally to occur at a pressure of about
12 GPa (120 kbar), corresponding to the 410 km discontinuity.
The β phase transforms to a γ, or spinel, structure known as
ringwoodite (Fig. 3.8-10) at a pressure of ~15 GPa, corresponding
to the less dramatic seismic discontinuity at 520 km.
At pressures above about 24 GPa, corresponding to the 660 km
discontinuity, γ spinel breaks down to a perovskite structure
and (Mg, Fe)O magnesiowustite.

The (Mg,Fe)SiO₃ pyroxene component also undergoes
changes, beginning with a transformation to garnet below about 
200 km. Below 600 km, some of the Mg-bearing garnet, majorite, 
transforms to a structure called ilmenite. Beneath about 660 km, 
the majorite/ilmenite transforms to perovskite. Some of the 
Fig. 3.8-8 Predicted mineral assemblages
as a function of depth for a mantle of
pyrolite composition. (Ringwood, 1979.
Composition and origin of the earth, in The
Earth, Its Origin, Structure and Evolution,
ed. M. W. McElhinny, copyright 1979 by
Academic Press, reproduced by permission
of the publisher.)


majorite probably survives into the lower mantle as stishovite,
a high-pressure phase of quartz, and an Al₂O₃-rich phase.
Unlike the olivine transformations that cause distinct seismic
discontinuities in the transition zone, the pyroxene and garnet
transformations occur gradually and contribute to a high 
velocity gradient through the transition zone down to about 
770 km (Section 3.5.4). 

These phase changes are investigated using experiments that 
simulate the pressures, temperatures, and compositions in the 
earth. Because the experiments are difficult, extrapolations 
of lower pressure and temperature data via thermodynamic 
calculations are also used. An important factor for the velocity 
structure is that some phase transformations happen gradually 
over a range of depths (Fig. 3.8-11). A simple univariant phase 
change, in which material of a single composition changes 
completely from one phase to another as pressure increases, 
causes a sharp discontinuity in velocity. A more complicated 
multivariant phase change involving a system of variable 
compositions causes two or more phases to coexist over a 


broad region of pressure, and so produces a velocity gradient. 
Thus seismological studies that better define the velocity structure
of the transition zone improve our understanding of its
composition. 

The mineralogical models agree with the depths of the seismic 
discontinuities and their other characteristics. The olivine
α-to-β reaction should occur over a narrow depth range, as
shown by the volume fractions in Fig. 3.8-9. This prediction 
is consistent with the sharpness of the seismic discontinuity, 
which is observed with high-frequency (short-wavelength) 
waves. The transformation is exothermic (releasing heat) and 
hence would occur at lower pressures in subducting slabs due 
to the colder temperatures (Section 5.4.2). This expectation 
agrees with seismic observations showing an elevation of the 
410 km discontinuity in and around subducting lithosphere. 
By contrast, the β-to-γ transformation should occur over a
broader depth range. This prediction agrees with seismic 
observations of the 520 km discontinuity, which is invisible to 
high-frequency waves and seen only with longer wavelengths. 
Fig. 3.8-9 A model for the relative proportions of major mineral phases
as a function of depth in the upper mantle. The rapid changes between the
olivine and spinel phases (α, β, γ) cause seismic discontinuities at depths
of 410 km and 520 km, whereas the gradual transformation of pyroxene
to garnet steepens the velocity gradient in the transition zone (410 km to
660 km). (Weidner, 1986. Reproduced with permission of Springer-Verlag.)


Fig. 3.8-10 Comparison of the crystal structures of (Mg, Fe)₂SiO₄, in
its low-pressure α olivine phase (top) and its γ-spinel ringwoodite phase
(bottom), which is about 10% denser. Spheres correspond to ions of
oxygen (white), silicon (black), and magnesium/iron (grey). (After Press
and Siever, 1982.)


Fig. 3.8-11 Schematic phase diagrams showing the relation between the
nature of a phase change and the corresponding velocity discontinuity.
Panels: univariant reaction (phase compositions remain constant across
the boundary) and divariant reaction (phase compositions vary with depth
in the transition region); the horizontal axes show composition.
(Bina and Wood, 1987. J. Geophys. Res., 92, 4853-66, copyright by the
American Geophysical Union.)


However, the γ-spinel to perovskite and magnesiowustite transition
should occur over a narrow depth range, consistent with
the observed sharpness of the 660 km seismic discontinuity. 
The reaction is endothermic (absorbs heat) and so should occur 
at greater depths for colder temperatures. Studies have shown 
that the discontinuity is depressed to depths of 700 km or more 
in and around subducting lithosphere. 
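
The opposite deflections of the two discontinuities can be illustrated with a simple estimate. In the sketch below, the shift in transition pressure is taken as the slope of the phase boundary (the Clapeyron slope, dP/dT) times the temperature anomaly, and is converted to depth using the hydrostatic gradient ρg. All numerical inputs are illustrative assumptions on our part rather than values from this section: slopes of +3.0 and −2.5 MPa/K, a slab 500 K colder than its surroundings, g = 10 m/s², and densities near the average across each discontinuity in Table 3.8-1.

```python
def boundary_shift_km(clapeyron_mpa_per_k, delta_t_k, density_g_cm3, g_m_s2=10.0):
    """Depth shift of a phase boundary in km (positive = deeper).

    delta_P = (dP/dT) * delta_T, converted to depth with dP/dz = rho * g.
    """
    delta_p_pa = clapeyron_mpa_per_k * 1e6 * delta_t_k
    pressure_gradient_pa_per_km = density_g_cm3 * 1e3 * g_m_s2 * 1e3
    return delta_p_pa / pressure_gradient_pa_per_km


# ASSUMPTION: hypothetical temperature anomaly of a cold slab, in K.
SLAB_DELTA_T = -500.0

# ASSUMPTION: illustrative Clapeyron slopes. Positive for the exothermic
# alpha-to-beta reaction, negative for the endothermic breakdown of gamma spinel.
shift_410 = boundary_shift_km(+3.0, SLAB_DELTA_T, 3.6)
shift_660 = boundary_shift_km(-2.5, SLAB_DELTA_T, 4.2)

print(f"410 km discontinuity shift: {shift_410:+.0f} km")
print(f"660 km discontinuity shift: {shift_660:+.0f} km")
```

With these assumed inputs the shallower boundary rises and the deeper one sinks by a few tens of kilometers, the same sense as the observed deflections; a colder slab or a steeper slope would give larger shifts.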

An unresolved question is whether the lower mantle is chemically
distinct from the upper mantle, which has important
implications for how the two have mixed during the earth’s
evolution. In models like those depicted in Fig. 3.8-8, the two
are assumed to have the same bulk chemistry, and the increasing
velocity and density in the lower mantle result from self-compression.
The velocity data do not appear to require phase
changes in the lower mantle. However, the lower mantle may be
denser than expected for pyrolite, and hence perhaps enriched
in iron and silica. The observation that some subducting
lithosphere penetrates the 660 km discontinuity (Section 5.4)
indicates that mixing occurs. However, even if all slabs reach
the lower mantle, the earth may not be old enough for the
lower and upper mantles to be well mixed.⁹ Another possibility
is that the early earth had distinct upper and lower mantle
convection systems, and whole mantle convection began later.


9 Consider a bowl of cake batter after only a few beats of a mixing spoon. 
3.8.4 Composition of D"

Seismic observations give a picture of the D" region (Section
3.5.4) that includes lateral velocity variations, vertical
layering, and anisotropy. Hence processes there may be as complex
as in the lithosphere, the other major thermal boundary
layer. This complexity may reflect factors including subducted 
lithosphere, the generation of mantle plumes, and interactions 
between the core and the mantle. 

Figure 3.8-12a (right) shows a simple convection model, with
cold material sinking to the CMB, heating up from contact 
with the core, and then rising again. The left side of the figure 
shows the resulting vertical velocity profiles in regions of 
downwelling (solid line) and upwelling (dashed line). Thus 
the large (> ±5%) lateral seismic variations at the base of the 
mantle would be caused by temperature variations. However, 
given the complex seismic structures observed, this model 
component seems necessary but insufficient. 

The other possibilities shown involve subducted slabs. In 
Fig. 3.8-12b, the subducted slabs do not reach the top of the 
core, but remain separated by a chemically distinct layer. This 
layer may result from early planetary differentiation, or may 
have grown by chemical reactions between the mantle and the 
core. High-pressure experiments imply that perovskite and 
magnesiowustite would react with iron. These mantle dregs 
might be thinned in regions of mantle downwelling, and 
thickened beneath upwellings. Layering in the dregs may explain 
observations of transverse isotropy in downwelling regions 
and azimuthal anisotropy in upwelling regions (Section 3.6.6). 
The velocity increase of the D" discontinuity may be partly 
caused by ponded slab material, which will still be colder and 
have higher velocity than ambient rock. This discontinuity 
may be enhanced by dregs flowing up and over ponded slabs. 
The ultra-low-velocity zone (ULVZ) at the very bottom of the 
mantle may be due to the lower velocities of an iron-rich layer 
or to partial melting within it. 

Another possibility is that the part of the subducted lithosphere
that started as basaltic ocean crust and then transformed
to eclogite transforms to a material that is seismically
faster than the rest of the lower mantle (Fig. 3.8-12c). This
phase could delaminate from the slabs and accumulate, forming
a different chemical boundary layer. If it remained solid,
it might partially explain the D" discontinuity. Alternatively, 
if it melted, it might explain the ULVZ. Either way, its laminar 
nature might explain the observed seismic anisotropy. The 
lateral variations in velocity would correlate with anisotropy; 
SH waves would travel fast in downwelling regions because of 
transverse isotropy, but be slowed by the vertical laminations 
beneath upwellings. 

D" may also signify the bottom of the perovskite stability 
field (Fig. 3.8-12d). Large radial changes in temperature and/or 
composition at the base of the mantle could move perovskite 
or a secondary phase out of its range of stability, causing a 
phase transformation. One possibility is a transformation of
perovskite to stishovite and magnesiowustite, which occurs 










Fig. 3.8-12 Schematic diagram of processes that might cause the velocity 
structures observed at the base of the mantle. Right panels show the 
different scenarios, and left panels show the resulting velocity-depth 
profiles for regions of downwelling (solid lines) and upwelling (dashed 
lines). The scenarios, discussed in the text, are (a) general thermal 
convection, (b) the interaction of subducted slabs with a chemical 
boundary layer consisting of dense mantle dregs, (c) a chemical boundary 
layer formed from delaminated post-eclogitic ocean crust brought down 
with the slabs, and (d) a mineralogical phase change. (After Wysession, 
1996a. Subduction , 369-84, copyright by the American Geophysical 
Union.) 

with an increase in the iron/magnesium ratio. Stishovite has 
high seismic velocities and might contribute to the D" discontinuity.
In this case, anisotropy might reflect orientation of
crystals due to lateral flow. The denser magnesiowustite might 
settle to the bottom, forming the ULVZ. 

Given our limited knowledge, D" may involve these and other 
effects. For example, if the vertical temperature difference 
across D" is small (about 300°C), then convection should play 
Fig. 3.8-13 Possible relationships between
the geotherm (dashed line) and the solidus
(solid line), for the inner and outer cores.
Left: If the core is homogeneous, the solidus
should be continuous across the inner and
outer cores, so the gradient of the geotherm
must be shallower than that of the solidus
for the inner core to be solid and the outer
core to be liquid. Right: If the inner and
outer cores are chemically different, the solidus
can differ between them, allowing a steeper
gradient for the geotherm.



a lesser role relative to a chemical boundary layer. If the contrast
is large, perhaps 1500°C, plume generation should be
more significant, and it would be harder to maintain a distinct 
chemical layer. 

3.8.5 Composition of the core 

Interesting issues about the core also remain unresolved. The
density and bulk sound speed data (Fig. 3.8-7) suggest that the
core has a composition similar to that of iron, but with a less
dense element of lower atomic number added. Other arguments
for an iron core are from cosmochemistry. Meteorites
are roughly divided into stony meteorites, resembling the
mantle, and iron meteorites, composed of an iron-nickel alloy,
which are thought to be similar to the core.¹⁰ Convection of
molten iron is also considered the only suitable mechanism for
generating the earth’s magnetic field. The light element lowering
the core density is unknown: candidates include sulphur,
silicon, oxygen, potassium, and hydrogen. Laboratory experiments
suggest that 10-15% of a light element would yield an
acceptable density.

It may seem surprising that the inner core is solid, because it 
should be at a higher temperature than the liquid outer core. 
Thus the effects of pressure favoring the denser solid phase must 
exceed those of temperature. From the ICB to the center of the 
earth, temperature is thought to increase by only 100-200°C, 
or about 3% of the inner core temperatures, which are about 
5000°C. Pressure, however, is thought to increase about 11%, 
from about 329 GPa at the ICB to 364 GPa at earth’s center 
(Fig. 3.8-5). The density inferred from the seismological data is 
consistent with that for solid iron expected from experiments 
and modeling. 

This situation requires that the inner core geotherm be at 
temperatures below the melting temperature curve (solidus), 
whereas the outer core geotherm must be above the solidus. 

10 Iron meteorites — and thus presumably the solid inner core — are like steel, 
recalling legends in which swords forged from meteorites are very strong and have 
magical powers. 
Fig. 3.8-14 Melting relations for the Fe-FeS system at the pressure of
the core-mantle boundary (1.4 Mbar). When a cooling liquid with 33% 
FeS reaches the phase boundary, solid Fe freezes out, enriching the liquid 
in FeS. In this analogy, the inner core is freezing out from, and thus 
chemically different from, the outer core. (Data from Usselman, 1975.) 

Two suggestions have been offered for this effect. If the inner 
and outer cores were chemically identical (Fig. 3.8-13, left), the 
solidus should rise smoothly with depth. The geotherm would 
be shallower than the solidus, so that they intersect at the ICB, 
but steeper than the adiabatic gradient required for convection
in the outer core. However, some theoretical calculations
suggest that the superadiabatic temperature gradient in the 
core required for convection would be steeper than the solidus. 
If so, the solid inner and liquid outer cores can be explained by 
assuming that the inner core is chemically different from the 
outer core, and thus has a different melting curve (Fig. 3.8-13, 
right). Thus, only in the inner core does the geotherm lie below 
the solidus and result in a solid phase. 

Figure 3.8-14 illustrates this idea, assuming that the light element
in the core is sulfur. In this phase diagram for the Fe-FeS
system extrapolated to core conditions, sulfur significantly
lowers the melting temperature of iron. Cooling a liquid iron
mixture with 12% sulfur, corresponding to 33% FeS, causes
solid Fe to freeze out, leaving the liquid richer in FeS.¹¹ In this
analogy, the outer core corresponds to the FeS-rich liquid, and 
the inner core to the denser Fe solid. The nickel would also 
preferentially enter the solid phase. Such a model predicts an 
inner core of approximately 80% Fe and 20% Ni, and an outer 
core with 86% Fe, 12% S, and 2% Ni. The inner core's freezing 
is thought to be crucial to the convection in the outer core, 
because the sinking iron releases gravitational potential energy. 
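
The correspondence between 12% sulfur and 33% FeS quoted above follows from the atomic masses. The sketch below assumes that the percentages are by weight and uses standard atomic masses (Fe 55.845, S 32.06 g/mol); the function name is an illustrative assumption.

```python
FE_MASS = 55.845  # g/mol, standard atomic mass of iron
S_MASS = 32.06    # g/mol, standard atomic mass of sulfur

def sulfur_weight_fraction(fes_weight_fraction):
    """Weight fraction of S in an Fe-FeS mixture with the given FeS weight fraction."""
    s_in_fes = S_MASS / (FE_MASS + S_MASS)  # share of the FeS mass that is sulfur
    return fes_weight_fraction * s_in_fes

s_fraction = sulfur_weight_fraction(0.33)
print(f"33% FeS corresponds to about {100 * s_fraction:.0f}% S by weight")
```

Under these assumptions the same sulfur fraction appears in the outer core composition quoted in the text, whose components (86% Fe, 12% S, 2% Ni) sum to 100%.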

It has been estimated that the outer core’s convection is driven
in approximately equal fractions by this process, the latent heat
of the crystallizing inner core, and the loss of primordial heat.
An additional contribution might come from radiogenic heat
production from potassium or uranium, if either is present.

Such models suggest that the boundary between the inner 
and outer cores is both a phase boundary and a compositional 
boundary, like the CMB. The boundary may be quite complex. 

Some evidence suggests that the attenuation of PKP-DF waves
is greatest in the outer few hundred km of the inner core, implying
that this zone may be somewhat mushy. It has also been
suggested that iron crystallizes at the ICB at some latitudes,
and dissolves back into the outer core at other latitudes, constrained
by magnetic forces. This effect may cause preferential
alignment of iron crystals, and thus inner core anisotropy
(Section 3.6.6). Seismological studies and experimental and
theoretical studies of materials at high pressures and temperatures
are being used to investigate these issues.

3.8.6 Seismology and planetary evolution 

We have seen in this section that seismology gives a snapshot of 
the present stage of the earth’s thermal and chemical evolution. 
Seismology shows the present thickness of the lithosphere, 
which may have increased with time, and provides much of 
our information about plate tectonic processes and mantle 
convection. Seismology similarly provides most of what we 
know about the core, including the present sizes of the inner 
and outer cores that reflect the progressive freezing of the solid 
inner core from the liquid outer core. Hence, as shown in 
Fig. 3.8-15, the core has been cooling with time, causing the 
inner core to grow. 

What we know about the earth and our more limited knowledge
of the moon and other planets suggest that although there
are differences among the inner planets that reflect their initial 
compositions, there are also similarities in their evolution. As 
shown in Fig. 3.8-16, planets may follow a similar life cycle, 
with phases including their formation, early convection and 
core formation, plate tectonics, terminal volcanism, and quiescence.
This evolution is driven by the available energy sources

11 This effect in which the composition of the liquid and the solid differ is called 
fractional crystallization and has many geological applications, including formation 
of rocks from a cooling magma. It can be illustrated with partially frozen apple juice, 
where the liquid tastes sweeter because it is enriched in sugar relative to the solid 
fraction. 



Fig. 3.8-15 Evolution of the core geotherm, assuming that the solidus is 
continuous between the inner and outer cores. Early in the earth’s history 
the core geotherm (dashed line) was everywhere greater than the solidus, 
making the whole core molten. As the core cooled, the geotherm lowered, 
causing the growth of a frozen inner core. The ICB is the current 
intersection between the geotherm and the solidus. (After Stacey, 1992. 
From Physics of the Earth, 3rd edn, copyright © 1992 by John Wiley &
Sons, Inc. (New York). Reprinted by permission.) 

and reflects the planets’ cooling with time. Thus, even though 
the planets formed at about the same time, they are at different 
stages in their life cycles.¹² The earth is in its middle age, characterized
by active plate tectonics.

Hence the approaches used to study the earth’s interior can 
be applied to other planets. A five-station seismological network
deployed on the moon by the Apollo missions found a
very low level of seismicity, of which most reflected meteoroid 
impacts or small moonquakes generated by tidal forces. Travel 
time studies yielded the velocity profile shown in Fig. 3.8-17, 
which has considerable uncertainty owing to the small number 
of seismometers and the difficulty of identifying arrivals due 
to scattering (Fig. 3.7-10). Various interpretations have been 
made of these results. Although it is tempting to correlate the 
low-velocity zone with an asthenosphere, thermal models predict
that this region would be too cold. As a result, the zonation
of the mantle is thought to represent compositional differences. 

12 Consider a human and dog born on the same date. 
Fig. 3.8-16 A model for the evolution of
the terrestrial planets, showing the energy 
sources at each stage, presented in text 
and verse by Kaula (1975). (© 1975 by 
Academic Press, reproduced by permission 
of the publisher.) 
Fig. 3.8-17 Velocity model from lunar seismic data (left) and a possible compositional interpretation (right). Squares show seismometer locations,
arrowheads show major meteoroid impacts, and large and small dots denote shallow and deep moonquakes. (After Nakamura, 1983 (J. Geophys. Res.,
88, 677-86, copyright by the American Geophysical Union) and Hubbard, 1984.)


There is a suggestion of decreased velocity below 1000 km, 
which thermal models suggest could be consistent with an 
asthenosphere. Seismological efforts to detect a core are inconclusive,
and the moment of inertia ratio of 0.39 allows for at
most a small core. 
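
To see why a moment of inertia ratio close to 0.4 leaves room for only a small core, consider a planet made of a uniform mantle and a uniform core. A homogeneous sphere has C/Ma² = 0.4, and concentrating mass toward the center lowers the ratio. The Python sketch below is an illustration under stated assumptions, not a fit to data: the densities and the lunar core radius are hypothetical round numbers, and the function name is invented for this example.

```python
def inertia_ratio(radius_km, core_radius_km, mantle_density, core_density):
    """Return C/(M a^2) for a sphere with a uniform core and a uniform mantle.

    The planet is treated as a uniform sphere of mantle density plus a
    smaller sphere carrying the excess core density. Density units cancel.
    """
    a, rc = radius_km, core_radius_km
    excess = core_density - mantle_density
    mass_term = mantle_density * a**3 + excess * rc**3      # proportional to M
    inertia_term = mantle_density * a**5 + excess * rc**5   # proportional to C
    return 0.4 * inertia_term / (a**2 * mass_term)

# Hypothetical densities in g/cm^3; radii in km
uniform = inertia_ratio(1738.0, 0.0, 3.3, 3.3)          # homogeneous sphere
small_core = inertia_ratio(1738.0, 350.0, 3.3, 8.0)     # moon-sized body, small dense core
large_core = inertia_ratio(6371.0, 3480.0, 4.5, 11.0)   # earth-sized body, large dense core

for label, value in [("uniform", uniform), ("small core", small_core), ("large core", large_core)]:
    print(f"{label}: C/Ma^2 = {value:.3f}")
```

In this simplified model a small dense core lowers the ratio only slightly below 0.4, whereas a core occupying about half the radius lowers it substantially.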

Hence it appears that the moon now has a thick lithosphere 
and is tectonically inactive. It thus seems to have lost much of 
its heat, presumably because of its small size, which favors 
rapid heat loss. In general, we would expect the heat available 


from the gravitational energy of accretion and radioactivity to 
increase as the planet’s volume, whereas the rate of heat loss 
through the surface should depend on its surface area. Hence 
the remaining heat should vary as 

$$\text{remaining heat} = \frac{\text{available}}{\text{loss}} = \frac{(4/3)\pi r^3}{4\pi r^2} = \frac{r}{3}, \qquad (18)$$

so larger planets would retain more heat and be more active. 
From such arguments, we might expect Mercury and Mars,
which are larger than the moon but smaller than the earth,
to have also reached their old age with little further active
tectonics. Mercury may still have a small liquid core, kept molten
by tidal forces from the sun, which contributes to the observed
magnetic field. Venus, which is comparable in size to the earth,
might still be active but with episodic, rather than continuous,
plate tectonics. Seismology can contribute little to the active
discussion of these topics until seismometers are deployed on
these planets. Although only one seismometer has been operated
on Mars and yielded inconclusive results,¹³ seismometers
are planned for future missions.
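
The volume-to-surface-area argument can be made concrete by ranking the bodies by r/3. The sketch below is a rough illustration only. The radii of the moon, Mars, and the earth are those listed in computer problem C-6; the radii for Mercury and Venus are approximate values assumed here, and the variable names are illustrative.

```python
# Mean radii in km. Moon, Mars, and earth are from the table in problem C-6;
# Mercury and Venus are approximate values assumed for this illustration.
RADII_KM = {
    "moon": 1738.0,
    "Mercury": 2440.0,
    "Mars": 3390.0,
    "Venus": 6052.0,
    "earth": 6371.0,
}

def retention_index(radius_km):
    """Ratio of volume to surface area, r/3, used as a proxy for heat retention."""
    return radius_km / 3.0

earth_index = retention_index(RADII_KM["earth"])
relative = {name: retention_index(r) / earth_index for name, r in RADII_KM.items()}

for name in sorted(relative, key=relative.get):
    print(f"{name:8s} {relative[name]:.2f}")
```

By this simple measure the moon retains heat least effectively, Mercury and Mars are intermediate, and Venus is nearly equivalent to the earth, in line with the ordering discussed in the text. The measure ignores differences in composition, heat sources, and heat transport.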

Further reading 

Refraction seismology and its use in crustal studies are covered in many 
general geophysics texts, such as Fowler (1990) and Reynolds (1997). 
More detailed treatments can be found in exploration textbooks like 
Dobrin and Savit (1988), Sheriff and Geldart (1982), Telford et al. (1976),
and Kearey and Brooks (1984). Additional information can be obtained 
in review papers such as Braile and Smith (1975), Kennett (1977), or 
Spudich and Orcutt (1980). A summary of crustal structure results and 
interpretations for the continental USA can be found in Pakiser and 
Mooney (1989). Meissner (1986) presents an integrated treatment of 
observations and models for the continental crust. Reviews on the nature 
of the Mohorovicic discontinuity are given by Jarchow and Thompson 
(1989), Braile and Chiang (1986), and Fountain and Christensen (1989). 

13 Due to operational constraints, the seismometer was mounted on the lander 
portion of the spacecraft, rather than in direct contact with Mars. It is rumored that 
consideration was given to saving weight on the lander by moving the seismometer to 
the orbiter. 


Gibson and Levander (1988) discuss possible artifacts in lower crustal reflection
data.

The extensive literature on reflection seismology includes the introductory
exploration texts listed above and advanced treatments, including
Claerbout (1976, 1985), Robinson and Treitel (1980), Waters (1981),
Sheriff and Geldart (1982), Robinson (1983), and Yilmaz (1987). The subject
is closely allied to that of geophysical signal processing, discussed in
texts including Kanasewich (1981) and Hatton et al. (1986).

Applications of seismology to earth structure are discussed in texts and 
the research literature. Introductory texts such as Bolt (1982), Bott (1982), 
Gubbins (1990), Doyle (1995), Lay and Wallace (1995), Lowrie (1997), 
Shearer (1999), and Udias (1999), have good overviews. Simon (1981) is 
a manual for seismogram interpretation, showing examples of records 
for earthquakes at various distances and depths. The classic texts by 
Gutenberg (1959) and Jeffreys (1976) are excellent starting points for 
further treatment of the data and methods. Bullen and Bolt (1985) has a 
detailed discussion of ray theory for the spherical earth. Aki and Richards 
(1980), Ben-Menahem and Singh (1981), and Kennett (1983) treat both 
ray theory and more advanced methods. The normal mode simulation 
of body wave propagation shown in Fig. 3.5-19 is available at
http://epsc.wustl.edu/seismology/michael/movie.html.

Karato and Spetzler (1990) review the physical mechanisms causing 
anelasticity. Information about anisotropy can be found in Babuska and 
Cara (1991) and Silver (1996). For discussion of scattering and attenuation,
see Kanamori and Anderson (1977), Brennan and Smylie (1981),
Jackson (1993), Mitchell (1995), Sato and Fehler (1998), and Romanowicz 
(1998). Garnero (2000) summarizes results for the lateral heterogeneity of 
the lowermost mantle. 

We alluded only briefly to the nonseismological geophysical data and to 
chemical results applicable to study of the earth’s interior. In addition to 
journal articles, useful texts are those by Wyllie (1971), Bullen (1975), 
Ringwood (1975), Wood and Fraser (1977), Brown and Mussett (1993), 
Bott (1982), Melchior (1986), Jacobs (1987), Lambeck (1988), Anderson 
(1989), Stacey (1992), and Poirier (2000). Useful reviews can be found in 
McElhinny (1979), Ahrens (1995a, b, c), Boschi et al. (1996), Boehler 
(1996), Crossley (1997), Gurnis et al. (1998), and Davies (1999).


Problems



1. Use the data from the refraction experiment in Fig. 3.2-5 to find the 
crust and mantle velocities and the crustal thickness. Remember 
that this is a reduced travel time plot. 

2. For a case of two layers overlying a halfspace, derive an expression 
for the thickness of the second (deeper) layer in terms of the second 
crossover distance. 

3. Analyze the data from the marine refraction experiment (Lewis, 
1978) shown in Fig. P3.1, assuming for simplicity that the structure 
consists of a water layer, a crustal layer, and a mantle halfspace. 

(a) Assuming that the first arrivals are described by two line 
segments, for head waves at the top of the crust and mantle, 
find the corresponding velocities. 

(b) Although the direct wave traveling in the water layer is not 
shown, the P velocity for water is 1.5 km/s. Use the time 
intercept for the crustal head wave to find the water depth. 

(c) Use the time intercept for the $P_n$ wave to find the crustal
thickness. 

4. To show that the head wave is predicted by Fermat’s principle, 
consider a layer of thickness h with velocity $v_0$, overlying a
halfspace with a higher velocity, $v_1$.


(a) Derive the travel time to distance x for a wave that is incident 
on the boundary at a distance y from the source, travels 
for some distance just below the boundary, and then returns 
to the surface at the same incidence angle at which it went 
down. 

(b) Find the y value giving an extremal travel time, and show 
that it corresponds to the critical angle of incidence. 

(c) Determine if this travel time is a minimum or a maximum. 

5. Use the data for the reversed profile shown in Fig. P3.2 to find the 
crust and mantle velocities, Moho dip, and crustal thickness. 

6. (a) Derive the travel time for the head wave on the up-dip path of a
reversed profile with a dipping layer (Eqn 3.2.17).

(b) Show that the equations for the travel time of the head wave 
for a dipping layer (Eqns 3.2.16 and 3.2.17) reduce to the flat 
layer result in the case of zero dip. 

7. Derive the Dix equation for interval velocity (Eqn 3.3.19) from the 
formula for rms velocity. 

8. Consider two pairs of seismograms. One pair have the same midpoint,
but the offset for one record is the negative of the first. The
other pair have the same source point, but the offset for one record 
Fig. P3.2 See problem 5.

is the negative of the first. Sketch the ray paths for a single dipping 
layer, and explain which have the same travel times and why. 

9. Define the cross-correlation (Eqn 3.3.68) for discrete time series. 
Such a series with N points can be written $f(t) = f(n\Delta t)$, where n
goes from 0 to N - 1 and $\Delta t$ is the time increment between points.

10. Given a common offset gather, what can you tell about structure 
along a profile? 

11. Assume that a 24-fold seismic survey records data sampled every 
40 milliseconds, and that each trace is 10 s long. For a source spacing
of 25 m, how many data points are recorded in a 100 km-long
survey? 

12. Given the definition of the travel time curve for a spherical earth 
$T(p) = p\Delta(p) + \tau(p)$, prove that $d\tau/dp = -\Delta(p)$.

13. (a) Use the travel times for PcP and PKiKP at vertical incidence
(Fig. 3.5-4) to estimate the average P-wave velocity in the outer
core.

(b) Use the travel times for PKiKP and PKIKP at vertical incidence
(Figs. 3.5-4 and 3.5-7) to estimate the average P-wave velocity
in the inner core.

14. Compare the travel time curves (Fig. 3.5-4) for earthquakes at the 
surface and at a depth of 600 km. Identify and explain some of 
the differences. 


Fig. P3.3 See problem 16. (Record label: 21:09:00 Station BAG: component LPZ: mag 1500.)

15. Use the travel time curves (Fig. 3.5-4) for earthquakes at the 
surface and at a depth of 600 km to find p in s/degree for direct P 
waves at 40° and 60°. Find the angle of incidence at the earthquake 
for these rays by converting p to s/radian and using the velocities in 
Fig. 3.5-1. Explain how the angle of incidence of rays reaching a 
given distance depends on earthquake focal depth. 

16. The seismogram in Fig. P3.3 for July 21, 1964, at Baguio 
(Philippines) contains arrivals from an earthquake that occurred in 
the Solomon Islands at 21 hours, 1 minute, 50 seconds. To analyze 
these data, which may be easier on an enlarged photocopy, 

(a) Measure the arrival time of the P wave and use the earthquake
origin time to find its travel time.

(b) Use the travel time curves to find how far from the station 
the earthquake occurred. 

(c) Trace the first 8 minutes of the seismogram after the P wave. 
Identify the S and PP phases on your tracing (use the travel 
time table for help). Can you identify other phases? 

(d) Identify the free surface reflections pP and sP. Measure their 
times after P, and use these times to estimate the depth. 

17. The travel time curve for $P_{\text{diff}}$, the P wave diffracted along the core-mantle
boundary, conveys information about the velocity at the
base of the mantle. The travel time curve is linear, with apparent
velocity $p = dT/d\Delta = r_{\text{cmb}}/v_{\text{cmb}}$, where $r_{\text{cmb}}$ is the radius of the core-mantle
boundary and $v_{\text{cmb}}$ is the velocity at the base of the mantle.

(a) Measure the apparent velocity in s/degree from the record 
section in Fig. P3.4, and compare it to the slope of the travel 
time curve in Fig. 3.5-4. 

(b) Convert p to s/radian, and find the velocity at the base of the 
mantle. 

(c) Imagine a location near the base of the mantle that is 180° 
away from an earthquake. The first SH wave to reach that 
spot will be $SH_{\text{diff}}$. What is the first SV wave (of nonzero
amplitude) to reach that spot? 
Fig. P3.4 See problem 17.


18. Derive R(t) and T(t) in Eqn 3.6.13.

19. (a) Use Table 2.9-1 to find the attenuation relaxation times for
modes $_0T_2$, $_0T_{30}$, and $_0S_{30}$ if their Q values are 250, 130, and
183, respectively.

(b) How far have the Love and Rayleigh waves corresponding to
$_0T_{30}$ and $_0S_{30}$ traveled during these times?

20. Show that for a damped harmonic oscillator, the quality factor 
$Q = 2\pi E/(-\Delta E)$, where E is the energy in the oscillating system, and
$\Delta E$ is the amount of energy lost during one cycle of the oscillation.

21. Find the percentage shear wave velocity differences due to physical 
dispersion between waves with periods of 1 and 10 s in the case of 

(a) a hot back-arc basin (Q = 25), 

(b) a cold lithospheric slab (Q = 250).

Explain physically what causes the difference between the results 
for parts (a) and (b). 

22. Show that $\alpha^2 - (4/3)\beta^2 = K/\rho$ and that Qr£ = (1 - L)Q ~£
where $L = (4/3)(\beta/\alpha)^2$.

23. Use the acceleration of gravity at the core-mantle boundary 
($g = 10.7$ m/s$^2$) to find the total mass and average density of earth’s
core. 

24. Assuming that the earth is ellipsoidal, but otherwise homogeneous: 

(a) What source location results in the greatest amount of 
antipodal defocusing of surface waves? 

(b) What source location results in the least amount of 
antipodal defocusing of surface waves? 

(c) For (a), estimate the approximate time range for the earliest 
and latest arrival of a surface wave with a phase velocity of 
4.0 km/s. 

Computer problems 

C-1. Write a program to trace direct, reflected, and head wave paths
for a dipping layer over a halfspace. Have the program compute 
the travel time for each path from the length of the path in each 
material (i.e., rather than using the analytic expressions for travel 
time). Use the program to replicate the results of problem 5. 


C-2. Write a program to generate and plot travel times for reflections 
from a series of flat interfaces using both the expressions for 
travel time and distance (Eqns 3.3.7, 3.3.8) and their hyperbolic 
approximation (Eqn 3.3.11). Calculate the travel times for the 
oceanic crust model given in Fig. 3.2-15. Compare the results 
from the two methods. 

C-3. (a) Write a subroutine to calculate the cross-correlation of two 
time series sampled at discrete times. 

(b) Write a subroutine to calculate a Vibroseis sweep signal 
(Eqn 3.3.57) of a given length, T, and frequency range ($f_1$, $f_2$).
The start time, $t_0$, and sample rate, $\Delta t$, should also be
parameters. 

(c) Generate and plot a sweep for $\Delta t = 0.0025$ s, $t_0 = 0$, $T = 5$ s,
$f_1 = 7$ Hz, and $f_2 = 14$ Hz. Use the results of part (a) to find
and plot its auto-correlation. 

C-4. (a) Write a subroutine to generate a reflector series (Eqn 3.3.61) 
for a series of layers with thicknesses $h_i$, velocities $v_i$, and
densities $\rho_i$.

(b) Calculate and plot the results for two layers over a halfspace,
if the first layer is 3 km thick, with $v = 2.5$ km/s and
$\rho = 2.1$ g/cm$^3$, the second is 4 km thick with $v = 3.2$ km/s
and $\rho = 2.4$ g/cm$^3$, and the halfspace has $v = 4.5$ km/s and
$\rho = 2.8$ g/cm$^3$.

(c) Calculate and plot the vertical incidence synthetic seismogram
for this structure and the source given in the previous
problem. 

(d) Using the results of problem C-3, cross-correlate the 
seismogram with the sweep and plot the resulting time 
series. 

(e) Repeat parts (b)-(d), cutting the second layer thickness in half 
each time. When can you no longer resolve the second layer 
on the time series after cross-correlation? 

C-5. (a) Write a program which takes a source at any depth and traces 
rays, selected by a range of incidence angles at the source, 
through an earth model. Have the graphic output show the 
source, ray paths, earth’s surface, core-mantle boundary, and 
inner core-outer core boundary. 

(b) Using PREM or another earth model, trace rays for sources at 
the surface and at 300 km depth. Have the ray paths show the 
effects of upper mantle discontinuities and the core. 

(c) Have the program produce a travel time plot. Can you resolve 
the upper mantle discontinuities in this plot? 

C-6. (a) Write a program that computes the mass, M, moment of 
inertia about the polar axis, C, and $C/Ma^2$ ratio for a planet
of radius a. To do this, treat the planet as a series of n shells 
whose densities you input. 

(b) Determine models for the densities of

         a          M                  C/Ma^2
earth    6371 km    5.977 × 10^24 kg   0.331
moon     1738 km    7.352 × 10^22 kg   0.395
Mars     3390 km    6.419 × 10^23 kg   0.365

that satisfy the observed M and $C/Ma^2$. Can you satisfy the
data for Mars and the earth without a dense core?





Earthquakes 


Much of what is known about earthquakes follows from study of the motion of the ground, 

Charles Richter, Elementary Seismology, 1958


4.1 Introduction 

Seismology deals with the generation and propagation of seismic
waves. Our initial focus has been on the propagation of
seismic waves and how they can be used to study the interior of 
the earth. We now turn to the generation of seismic waves and 
how they are used to study earthquakes. This association is so 
strong that seismology is sometimes viewed as the science of 
earthquakes, rather than of elastic waves in the earth. Both 
definitions are used, but the latter has become more common 
because seismology is the primary tool used to investigate earth 
structure as well as earthquakes, whereas techniques other 
than elastic waves are also used to investigate earthquakes. 

Earthquakes almost invariably occur on faults, surfaces in
the earth on which one side moves with respect to the other. 
Typically, earthquakes occur on faults previously identified by 
geological mapping, which shows that motion across the fault 
has occurred in the past. Earthquakes that occur on land and 
close enough to the surface often leave visible ground breakage 
along the fault. For example, earthquakes occur along the San 
Andreas fault, which can be seen cutting across California 
for great distances (Fig. 4.1-1). One of these, the famous 1906
magnitude 7.8 San Francisco earthquake on the San Andreas
fault, was one of the first earthquakes to be studied carefully.
Contemporary accounts showed that several meters of relative 
motion occurred along several hundred kilometers of the San 
Andreas fault (Fig. 4.1-2). 

The earthquake and the resultant fires did such damage 
(Fig. 1.2-10) that a study commission was formed. As part of 
the investigation, H. Reid proposed the elastic rebound theory 
of earthquakes on a fault. In this model, materials at distance 
on opposite sides of the fault move relative to each other, but 
friction on the fault “locks” it and prevents the sides from slip¬ 
ping (Fig. 4.1-3). Eventually the strain accumulated in the rock 
is more than the rocks on the fault can withstand, and the fault 



Fig. 4.1-1 Aerial photograph of the San Andreas fault in the Carrizo Plain 
in California, seen from the south. Note the displacement of stream gullies 
as the Pacific plate (near side) has moved to the left (northwest) relative to
North America. (Copyright John S. Shelton.) 


slips, resulting in an earthquake. The motion illustrated in 
this cartoon by an offset fence can sometimes be seen after 
earthquakes using other linear features, including rows of 
trees, railroad tracks, or roads (Fig. 4.1-4). 

The elastic rebound idea was a major conceptual breakthrough,
because the faulting seen at the surface had been previously
regarded as an incidental side effect of an earthquake,
rather than its cause. Subsequently, earthquake studies have 
been widely pursued for several reasons. One is to understand 
the large-scale geological processes causing earthquakes. It 
Fig. 4.1-2 Map of the portion of the San Andreas fault that slipped in
the 1906 San Francisco earthquake (top) and the amount of surface slip
reported at various points along it (bottom). This slip is the distance by
which the earthquake displaced originally adjacent features on opposite
sides of the fault. (Boore, 1977. © Seismological Society of America.
All rights reserved.)




Fig. 4.1-3 The elastic rebound model of earthquakes assumes that 
between earthquakes, material on the two sides of a fault undergoes 
relative motion. Because the fault is locked, features across it that were 
linear at time (a), such as a fence, are slowly deformed with time (b). 
Finally the strain becomes so great that the fault breaks in an earthquake, 
offsetting the features (time c). (Courtesy of S. Wesnousky.) 



Fig. 4.1-4 Displacement of crop rows resulting from slip along the 
Imperial fault, El Centro, California, on October 15, 1979. (Courtesy of
the National Geophysical Data Center.) 


turns out that earthquakes largely reflect the motions of 
lithospheric plates, and so provide valuable information about 
how and why plates move. For example, earthquakes on the 
San Andreas fault result from the steady motion between the 


North American and Pacific plates (Fig. 5.2-3). A second 
reason is to understand the fundamental physics of earthquake 
faulting. There are many unanswered questions about how 
and when faults break, even for earthquakes that occur near 
the earth’s surface, where data are relatively easy to gather.
These issues are important for society because, as discussed in 
Chapter 1, knowledge of where and when earthquakes are 
likely, and of the expected ground motion during them, can 
help mitigate the risk they pose. 

The largest earthquakes typically occur at plate boundaries. 
Using elastic rebound theory, we think of them as reflecting the 
most dramatic part of a process called the seismic cycle, which
takes place on segments of the plate boundary over hundreds to 
thousands of years. During the interseismic stage, which makes 
up most of the cycle, steady motion occurs away from the fault 
but the fault itself is “locked,” although some aseismic creep 
can also occur on it. Immediately prior to rupture there is the 
preseismic stage that can be associated with small earthquakes 
(foreshocks) or other possible precursory effects. The earthquake
itself marks the coseismic phase during which rapid
motion on the fault generates seismic waves. During these few 
seconds, meters of slip on the fault “catch up” with the few 
mm/yr of motion that occurred over hundreds of years away 
from the fault. Finally, a postseismic phase occurs after the 
earthquake, and aftershocks and transient afterslip occur for a 
period of years before the fault settles into its steady interseismic
behavior again.
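
The idea that coseismic slip "catches up" with interseismic motion can be illustrated with simple arithmetic. The following Python sketch assumes, for illustration only, that all of the far-field motion is eventually released as slip in earthquakes; the rates, intervals, and function names are hypothetical and do not describe any particular fault.

```python
def accumulated_slip_m(rate_mm_per_yr, interval_yr):
    """Slip deficit (in meters) built up at a steady rate over an interval."""
    return rate_mm_per_yr * interval_yr / 1000.0

def recurrence_interval_yr(slip_m, rate_mm_per_yr):
    """Time (in years) needed to accumulate a given slip at a steady rate."""
    return slip_m * 1000.0 / rate_mm_per_yr

# Hypothetical example: 5 mm/yr of relative motion across a locked fault
deficit = accumulated_slip_m(5.0, 500.0)
interval = recurrence_interval_yr(4.0, 5.0)

print(f"slip deficit after 500 yr at 5 mm/yr: {deficit} m")
print(f"time to accumulate 4 m at 5 mm/yr: {interval} yr")
```

With a few mm/yr of motion, meters of slip correspond to hundreds of years of accumulation, consistent with the cycle lengths described above. Aseismic creep, which the text notes can also occur, would lengthen these intervals.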

Studying this cycle is difficult because it extends for hundreds 
of years, so we do not have observations of it in any one place. 
Instead, we have observations from different places, which we 
assume can be combined to give a complete view of the process. 
It is far from clear how good that view is and how well our 
models represent its complexity. As a result, earthquake physics 
remains an active research area that integrates a variety of 
techniques. Most faults are identified from the earthquakes on 
them, and seismology is the primary tool used to study the 
motion during the earthquakes and infer the long-term nature 
of motion on the faults. Moreover, because earthquakes are 
such dramatic events, historical records of earthquakes are 
often available and provide data on the earthquake cycle for 
a given fault or fault segment. Field studies, both on land and 
under water, also provide information about the location, 
geometry, and history of faults. Geodetic measurements are 
used to study ground deformation before, during, and after 
earthquakes, and thus the processes associated with fault locking
and afterslip. For oceanic regions and deep earthquakes,
where geodetic and geological observations are not available, 
almost all of what we know about the earthquakes themselves 
comes from seismology. The results for individual earthquakes 
are then combined and integrated with those from other techniques,
as discussed in the next chapter, to better understand
how earthquakes in a given region reflect the large-scale tectonic 
processes that cause them. 

Of these approaches, our primary focus in this book is the 
information that seismology provides about earthquakes. The 
arrival time of seismic waves at seismometers at different sites 
is first used to find the location of an earthquake, known as the 
focus, or hypocenter, using techniques discussed in Chapter 7.
Next, as discussed in this chapter, the amplitudes and shapes of 


the radiated seismic waves are used to study the size of the 
earthquake, the geometry of the fault on which it occurred, 
and the direction and amount of slip. We introduce these 
techniques and discuss their applications, while leaving their 
derivation and details for more advanced treatments listed at 
the chapter’s end. 

It is worth bearing in mind that learning about earthquake 
faulting from the seismic waves that are generated is an inverse 
problem, like learning about earth structure from seismic 
waves. As discussed in Section 1.1.2, this means that studying 
seismic waves alone is limited in what it can tell about the 
earthquake process. We will see that the seismic waves radiated 
from an earthquake reflect the geometry of the fault and the 
motion on it, and so can give an excellent picture of the kinematics
of faulting. However, they contain much less information
about the actual physics, or dynamics, of faulting. In the
next chapter, we discuss how seismological results are being 
combined with experimental and theoretical studies of rock 
friction and fracture to explore the physics of earthquakes. 

4.2 Focal mechanisms 

4.2.1 Fault geometry 

To describe the geometry of a fault, we assume that the fault 
is a planar surface across which relative motion occurred 
during an earthquake. Geological observations of faults that 
reach the surface show that this is often approximately the case 
(Fig. 4.2-1), although complexities are common. Similarly,
we will see that this assumption is usually (but not always)
consistent with seismic data. Thus the fault geometry is
described in terms of the orientation of the fault plane and the
direction of slip along the plane.

Fig. 4.2-1 Fault cutting across a moraine near Crowley Lake, California.
The land in front has dropped relative to the background. (Copyright
John S. Shelton.)

Fig. 4.2-2 Fault geometry used in earthquake studies. The fault plane,
with normal vector n, separates the lower, or foot wall, block from the
upper hanging wall block (not shown). The slip vector, d, describes the
motion of the hanging wall block with respect to the foot wall block.
The coordinate axes are chosen with x₃ vertical and x₁ oriented along
the fault in the plane of the earth’s surface, such that the fault dip angle, δ,
measured from the −x₂ axis, is less than 90°. The slip angle λ is measured
between the x₁ axis and d in the fault plane, and φ_f is the strike of the fault
measured clockwise from north. (After Kanamori and Cipar, 1974. Phys.
Earth Planet. Inter., 9, 128-36, with permission from Elsevier Science.)

The geometry of this model is shown in Fig. 4.2-2. The fault 
plane is characterized by n, its normal vector. The direction of
motion is given by d, the slip vector in the fault plane. The slip
vector indicates the direction in which the upper side of the
fault, known as the hanging wall block, moved with respect to
the lower side, the foot wall block. Because the slip vector is in
the fault plane, it is perpendicular to the normal vector. 

Several different coordinate systems are useful in studying
faults. One is aligned such that the x₁ axis is in the fault strike
direction, the intersection of the fault plane with the earth’s
surface. The x₃ axis points upward, and the x₂ axis is perpendicular
to the other two. The dip angle δ gives the orientation
of the fault plane with respect to the surface. Because the x₁
axis could be defined in two directions, 180° apart, it is chosen
so that the dip measured from the −x₂ axis is less than 90°. The
direction of motion is represented by the slip angle, λ, measured
counterclockwise in the fault plane from the x₁ direction,
which gives the motion of the hanging wall block with respect
to the foot wall block. To orient this system relative to the
geographic one, the fault strike φ_f is defined as the angle in the
plane of the earth’s surface measured clockwise from north to
the x₁ axis.

Alternatively, the orientation of the fault and slip can be
described by giving the normal and slip vectors in a geographic
coordinate system with x pointing north, y pointing west, and z
pointing up. In this coordinate system, the unit normal vector
to the fault plane is

$$
\hat{\mathbf{n}} =
\begin{pmatrix}
-\sin\delta \, \sin\phi_f \\
-\sin\delta \, \cos\phi_f \\
\cos\delta
\end{pmatrix}
\tag{1}
$$

and the slip vector, a unit vector in the slip direction, is

$$
\hat{\mathbf{d}} =
\begin{pmatrix}
\cos\lambda \, \cos\phi_f + \sin\lambda \, \cos\delta \, \sin\phi_f \\
-\cos\lambda \, \sin\phi_f + \sin\lambda \, \cos\delta \, \cos\phi_f \\
\sin\lambda \, \sin\delta
\end{pmatrix}
\tag{2}
$$

These two different coordinate systems, (φ_f, δ, λ) and (n, d),
are useful for different purposes. Some calculations are more
easily done with respect to the fault, whereas others are more
easily done with respect to geographic directions.
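As a quick numerical check on Eqns 1 and 2, the sketch below evaluates n and d from the strike, dip, and slip angles and confirms that they are orthogonal unit vectors. The function names `fault_vectors` and `dot` are illustrative choices, and the sample angles are borrowed from the Vanuatu mechanism listed with Fig. 4.2-17; this is a minimal sketch rather than a standard routine.

```python
import math

def fault_vectors(strike_deg, dip_deg, slip_deg):
    """Return (n, d) as (north, west, up) tuples, following Eqns 1 and 2."""
    phi = math.radians(strike_deg)    # strike, clockwise from north
    delta = math.radians(dip_deg)     # dip
    lam = math.radians(slip_deg)      # slip angle
    n = (-math.sin(delta) * math.sin(phi),
         -math.sin(delta) * math.cos(phi),
         math.cos(delta))
    d = (math.cos(lam) * math.cos(phi) + math.sin(lam) * math.cos(delta) * math.sin(phi),
         -math.cos(lam) * math.sin(phi) + math.sin(lam) * math.cos(delta) * math.cos(phi),
         math.sin(lam) * math.sin(delta))
    return n, d

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

# Sample angles: strike 352 deg, dip 26 deg, slip 97 deg.
n, d = fault_vectors(352.0, 26.0, 97.0)
print("n =", n)
print("d =", d)
print("n.d =", dot(n, d))   # zero to within rounding error
```

Because the slip vector lies in the fault plane, the scalar product n·d should vanish for any choice of angles, which makes it a convenient test of a hand-coded implementation.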

Although the slip direction varies such that the slip angle
ranges from 0° to 360°, several basic fault geometries, described
by special values of the slip angle, are useful to bear in
mind (Fig. 4.2-3). When the two sides of the fault slide horizontally
by each other, pure strike-slip motion occurs. When λ =
0°, the hanging wall moves to the right, and the motion is called
left-lateral. Similarly, for λ = 180°, right-lateral motion occurs.
To tell which is which, look across the fault and see which way
the other side moved. The other basic fault geometries describe
dip-slip motion. When λ = 270°, the hanging wall slides downward,
causing normal faulting. In the opposite case, λ = 90°,
and the hanging wall goes upward, yielding reverse, or thrust,
faulting.¹ Most earthquakes consist of some combination of
these motions and have slip angles between these values. It is
thus useful, when thinking about earthquake mechanisms, to
remember the three basic faults. As discussed in Section 2.3.4,
the basic fault types can be related to the orientations of the
principal stress directions.
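The special slip angles above can be turned into a small lookup. The sketch below is only an illustration: the function name `describe_slip`, the tolerance, and the wording of the returned labels are assumptions, and the sign conventions are the ones stated in the text (λ = 0° left-lateral, 90° reverse, 180° right-lateral, 270° normal).

```python
import math

def describe_slip(slip_deg, tol=1e-9):
    """Name the faulting style implied by a slip angle in degrees."""
    lam = math.radians(slip_deg % 360.0)
    strike_part = math.cos(lam)   # > 0: left-lateral, < 0: right-lateral
    dip_part = math.sin(lam)      # > 0: reverse (thrust), < 0: normal
    lateral = "left-lateral" if strike_part > 0 else "right-lateral"
    vertical = "reverse (thrust)" if dip_part > 0 else "normal"
    if abs(dip_part) < tol:
        return f"pure {lateral} strike-slip"
    if abs(strike_part) < tol:
        return f"pure {vertical} faulting"
    return f"oblique: {vertical} with a {lateral} component"

for angle in (0, 90, 180, 270, 97):
    print(angle, describe_slip(angle))
```

Intermediate angles are reported as oblique slip, reflecting the point that most earthquakes combine the basic motions.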

This discussion brings out the point that although texts 
typically show vertically dipping strike-slip faults,² they are by
no means the norm. In fact, as discussed later, the largest earthquakes
occur on shallow-dipping thrust faults at subduction
zones. Although such faults are harder to study, because the 
fault trace is generally under water, the same basic principles 
apply. 

Real faults, of course, have finite dimensions and complicated
geometries. If we treat a fault as rectangular, the dimension
along the strike is called the fault length, and the dimension
in the dip direction is known as the fault width. Actual earthquake
fault geometries can be much more complicated than a
rectangle. The fault may curve and require a three-dimensional
description. Rupture may occur over a long time and consist of
several sub-events on different parts of the fault with different
orientations. Such complicated seismic events, however, can
be treated as a superposition of simple events. Thus, if we
understand the seismic waves generated by a simple, two-dimensional,
rectangular fault, we can model those resulting
from a more complicated set of ruptures. This application of the
principle of superposition is based on the assumption of linear
elasticity and is analogous to the way we constructed seismic
waves by summing normal modes (Sections 2.2.5 and 2.9).

¹ Seismologists often use the terms “reverse” and “thrust” fault interchangeably,
whereas structural geologists reserve the term “thrust” for a shallow-dipping reverse
fault.

² In part because many authors have spent time in California, and in part because
they are easy to draw.

Fig. 4.2-3 Basic types of faulting. Strike-slip
motion can be either right- or left-lateral.
Dip-slip faulting can occur as either reverse
(thrust) or normal faulting. (Eakins, 1987.)
[Panel labels: left-lateral strike-slip fault (λ = 0°); right-lateral
strike-slip fault (λ = 180°).]

Fig. 4.2-4 First motions of P waves observed
at seismometers located in various directions
about the earthquake provide information
about the fault orientation. The two nodal
planes separate regions of compressional
and dilatational first arrivals. One nodal
plane is the fault plane, and the other is the
auxiliary plane, but these data cannot
distinguish which is the actual fault plane.
[Figure labels: epicenter; auxiliary plane.]

4.2.2 First motions 

Seismograms recorded at various distances and azimuths are 
used to study the geometry of faulting during an earthquake, 
known as the focal mechanism. This operation uses the fact 
that the pattern of radiated seismic waves depends on the 
fault geometry. The simplest method, which we discuss first, 
relies on the first motion, or polarity, of body waves. More 
sophisticated techniques, discussed in the next section, use the 
waveforms of body and surface waves. 

The basic idea is that the polarity (direction) of the first
P-wave arrival varies between seismic stations at different
directions from an earthquake. Figure 4.2-4 illustrates this
concept for a strike-slip earthquake on a vertical fault. The first
motion is either compression, for stations located such that
material near the fault moves “toward” the station, or dilatation,
where the motion is “away from” the station. Thus when
a P wave arrives at a seismometer from below, a vertical-component
seismogram records an upward or downward first
motion, corresponding to either compression or dilatation.

The first motions define four quadrants, two compressional
and two dilatational. The division between quadrants occurs
along the fault plane and a plane perpendicular to it. In these
directions, because the first motion changes from dilatation to
compression, seismograms show small or zero first motions.
These perpendicular planes, called nodal planes, separate the
compressional and dilatational quadrants. If these planes can
be found, the fault geometry is known. A problem is that the
first motions from slip on the actual fault plane and from slip
on the plane perpendicular to it, the auxiliary plane, would be
the same, so the first motions alone cannot resolve which plane
is the actual fault plane. However, additional information
can often settle the question. Sometimes geologic or geodetic
information, such as the trend of a known fault or observations
of ground motion, indicates the fault plane. Often, smaller aftershocks
following the earthquake occur on, and thus delineate,
the fault plane. If the earthquake is large enough, the finite time
required for slip to progress along the fault causes variations in
the waveforms observed at different directions from the fault,
so these directivity effects can be used to infer the fault plane.

Fig. 4.2-5 A fault-oriented coordinate system for
describing the radiation pattern of an earthquake.
The body forces equivalent to the faulting are a pair
of force couples acting about the null axis. (After
Pearce, 1977.)

4.2.3 Body wave radiation patterns 

The radiation patterns of P and S waves, which we will not 
derive, can be obtained using the theory of seismic sources. The 
radiation patterns turn out to be those that would be generated 
by a set of forces with a corresponding geometry. Specifically, 
the radiation due to motion on the fault plane is what would 
occur for a pair of force couples, pairs of forces with opposite
direction a small distance apart. If one couple was oriented
in the slip direction with forces on opposite sides of the fault
plane, the other couple would be oriented in the corresponding
direction on opposite sides of the auxiliary plane. Thus the
elastic radiation can be described as resulting from a double
couple, and these forces are known as the equivalent body
forces for the fault slip, discussed further in Section 4.4.

It is important to bear in mind that the equivalent forces are 
only a simple model representing the complex faulting process 
that actually took place. We can view the faulting as occurring 
within a “black box” about which the radiated seismic waves 
provide only limited information. The seismic waves tell us 
only that some processes within the box produced seismic 
waves described by the equivalent forces. Often we have other
geological and geophysical data, together with (hopefully
valid) preconceptions, about the source. In particular, we often
(at least believe that we) have good reasons to favor slip on one 
of the possible fault planes and to interpret the faulting in terms 
of the regional geology and stress field. Similarly, we interpret 
aspects of the seismic wave field in terms of simple models 
of the physics of the faulting process, while recognizing that 
radiated seismic waves provide only a partial picture. 

The radiation patterns of double couples have natural symmetries
about the fault plane, and are thus normally written
using a coordinate system oriented along the fault. In such a
system (Fig. 4.2-5), the fault plane lies in the x₁-x₂ plane, so its
normal is the x₃ axis. The slip vector is in the fault plane, parallel
to the x₁ axis. The slip is such that material above the x₁-x₂
plane moves in the +x₁ direction with respect to the material on
the other side. The radiation pattern would be the same if the
slip in the x₃ direction occurred on the auxiliary plane, which
lies in the x₂-x₃ plane and whose normal is the x₁ axis. Thus
we can interchange the slip (x₁) and normal (x₃) directions,
so the slip vector on one plane is the normal vector on the
other, and vice versa. However, the direction orthogonal to
both, known as the null axis, is distinct. In this geometry, the
equivalent body force double couple acts about the x₂ axis, and
the forces are oriented along the x₁ and x₃ directions.

To see how the radiation patterns vary with the direction of
the receiver, consider the radiation field in spherical coordinates,
where θ is measured from the x₃ axis and φ is measured
in the x₁-x₂ plane (Figs 4.2-6 and 4.2-7). Seismic source theory
shows that far from the source, the displacement due to compressional
waves, which create the radial (ê_r) component of the
displacement (u_r) because their motion is along the propagation
direction, is

$$
u_r = \frac{1}{4\pi\rho\alpha^3 r} \, \dot{M}(t - r/\alpha) \, \sin 2\theta \, \cos\phi . \tag{3}
$$

Fig. 4.2-6 The body wave radiation pattern for a double couple source has
symmetry in the spherical coordinate system shown, corresponding to the
axes in Fig. 4.2-5. θ is measured from the x₃ axis, the normal to the fault
(x₁-x₂) plane, and φ is measured in the fault plane. The P-wave radiation
pattern has four lobes that go to zero at the nodal planes, which are the
fault and auxiliary (x₂-x₃) planes. The S-wave radiation pattern describes
a vector displacement that does not have nodal planes but is perpendicular
to the P-wave nodal planes. S-wave motion converges toward the T axis,
diverges from the P axis, and is zero on the null axis. (After Pearce, 1977,
1980.)

This expression has several parts. The first term is an amplitude
term. In an infinite medium, for which this was derived,
the amplitude would decay as 1/r. The second term reflects the
pulse radiated from the fault, Ṁ(t), which propagates away at
the P-wave speed α and so reaches a distance r after a delay r/α,
which is why it is evaluated at the time t − r/α.
Ṁ(t) is called the seismic moment rate function or source
time function. It is the time derivative of the seismic moment
function

$$
M(t) = \mu \, D(t) \, S(t), \tag{4}
$$

Fig. 4.2-7 Radiation amplitude patterns of P and S waves in the x₁-x₃
plane. a: Fault geometry, showing the symmetry of the double couple
about the x₂ axis. b: Radiation pattern for P waves, showing the amplitude
(left) and direction (right). c: Same as (b), but for S waves.

which describes the faulting process in terms of the rigidity μ of
the material and history of the slip D(t) and fault area S(t). The
latter terms are time-dependent, because they can vary during
an earthquake. As discussed in Section 4.6, the best measure
of earthquake size and energy release is the static (or scalar)
seismic moment

$$
M_0 = \mu \, \bar{D} \, S, \tag{5}
$$

where D̄ is the average slip (or dislocation) on the fault with
area S. We often use the seismic moment as a scale factor and
write Ṁ(t) = M₀ x(t), where x(t) is the source time function.
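To give a feel for the magnitudes in Eqn 5, the sketch below evaluates the scalar moment for one set of round numbers. The rigidity, slip, and fault dimensions are hypothetical values chosen only for illustration, and the function name `scalar_moment` is likewise an assumption.

```python
def scalar_moment(rigidity_pa, mean_slip_m, fault_area_m2):
    """Static seismic moment M0 = mu * D * S (Eqn 5), in N m for SI inputs."""
    return rigidity_pa * mean_slip_m * fault_area_m2

# Hypothetical example values, not taken from any particular earthquake:
mu = 3.0e10            # rigidity in Pa
mean_slip = 1.0        # average slip in m
area = 10.0e3 * 10.0e3 # 10 km x 10 km fault, in m^2

m0 = scalar_moment(mu, mean_slip, area)
print(f"M0 = {m0:.1e} N m")
```

Doubling either the average slip or the fault area doubles the moment, which is why M₀ serves as a convenient scale factor for the moment rate function.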

The final term, sin 2θ cos φ, describes the P-wave radiation
pattern. It is four-lobed, with two positive, compressional,
lobes and two negative, dilatational, ones. The displacement is
zero on the fault (θ = 90°) and auxiliary (φ = 90°) planes. Thus
the fault plane and auxiliary plane are nodal planes separating
compressional and dilatational quadrants. The maximum
amplitudes are midway between the two nodal planes.
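The four-lobed shape can be checked by evaluating the angular factor of Eqn 3 directly. The sketch below assumes the fault-oriented angles of Fig. 4.2-6 (θ from the x₃ axis, φ in the fault plane); the function name `p_pattern` and the sample directions are illustrative.

```python
import math

def p_pattern(theta_deg, phi_deg):
    """Angular factor sin(2*theta) * cos(phi) of the far-field P wave (Eqn 3)."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    return math.sin(2.0 * theta) * math.cos(phi)

# Lobe maxima lie midway between the nodal planes, in the x1-x3 plane.
print("theta=45,  phi=0 :", p_pattern(45.0, 0.0))    # compressional lobe
print("theta=135, phi=0 :", p_pattern(135.0, 0.0))   # dilatational lobe
# Nodal planes: the fault plane (theta = 90) and the auxiliary plane (phi = 90).
print("theta=90,  phi=30:", p_pattern(90.0, 30.0))   # zero to rounding error
print("theta=45,  phi=90:", p_pattern(45.0, 90.0))   # zero to rounding error
```

A positive value corresponds to outward (compressional) first motion and a negative value to dilatation, so the sign of this factor is all that a first-motion study uses.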

Similarly, the shear wave displacement has two components,
u_θ ê_θ + u_φ ê_φ, where

$$
u_\theta = \frac{1}{4\pi\rho\beta^3 r} \, \dot{M}(t - r/\beta) \, \cos 2\theta \, \cos\phi ,
$$

$$
u_\phi = \frac{1}{4\pi\rho\beta^3 r} \, \dot{M}(t - r/\beta) \, (-\cos\theta \, \sin\phi) . \tag{6}
$$

Note that the term involving Ṁ(t) corresponds to waves propagating
at the S-wave speed β. As shown in Fig. 4.2-6, the
S-wave motion does not have nodal planes, but it is perpendicular
to the P-wave nodal planes and is zero on the null axis.
It converges toward the center of the compressional quadrants,
which, as we will see shortly, is the location of the T, or least
compressive stress, axis. It also diverges from the centers of the
dilatation quadrants, known as the P, or most compressive
stress, axis. Thus, although the S-wave pattern does not reflect
the fault plane as clearly as the P-wave pattern, it can also be
used to study the fault geometry. An interesting feature of
Eqns 3 and 6 is that they show why S waves on seismograms are
usually bigger than P waves: the equations predict an average
ratio of α³/β³, or about 5.
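The amplitude ratio and the vanishing of the S wave on the null axis can both be checked numerically. In the sketch below, the velocity ratio α/β = √3 is an assumption (the value for a Poisson solid), and the function names `s_pattern` and `s_to_p_scale` are illustrative.

```python
import math

def s_pattern(theta_deg, phi_deg):
    """Angular factors (u_theta, u_phi) of the far-field S wave (Eqn 6)."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    return (math.cos(2.0 * theta) * math.cos(phi),
            -math.cos(theta) * math.sin(phi))

def s_to_p_scale(alpha, beta):
    """Ratio of the S and P amplitude prefactors in Eqns 6 and 3."""
    return (alpha / beta) ** 3

# Null axis is the x2 direction: theta = 90, phi = 90.
print("S pattern on null axis:", s_pattern(90.0, 90.0))
# Assumed velocity ratio alpha/beta = sqrt(3) (Poisson solid).
print("alpha^3 / beta^3      :", s_to_p_scale(math.sqrt(3.0), 1.0))
```

With this assumed ratio the prefactor comes out at roughly 5.2, consistent with the "about 5" quoted in the text.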

Because the radiated seismic waves vary as a function of
θ and φ, seismograms recorded at different directions from
the earthquake can be used to find the fault geometry. The
P wave is the first wave to arrive from an earthquake, so on a
seismogram it is an isolated arrival whose polarity is often easy
to identify. A set of P-wave first motions thus often makes
it possible to locate the nodal planes that divide the regions of
different polarity. The first S waves are harder to use, because
they arrive later in the seismogram and can be buried in a complicated
wave train. It is still possible, however, to use the S-wave
information. One way to do this is to consider the relative
amplitudes of the two S-wave components.

One additional concept is needed to determine fault plane 
solutions using the first motions from various seismic stations. 
The radiation patterns show the displacements that would 
occur on a sphere with infinitesimal radius about the source. 
The observations, of course, are at stations some finite distance 
from the source. We thus need to convert the observations 
at the stations to hypothetical ones surrounding the source. 
To do this, recall that seismic waves do not travel in straight 
lines from the earthquake to a station. Instead, because seismic 
velocities vary with depth, rays follow curved paths. 



Fig. 4.2-8 The angle of incidence at the earthquake source is the angle 
from the vertical at which the ray leaves the source, and thus the angle at 
which the ray intersects the lower focal hemisphere. 


As discussed in Section 3.4, the ray paths are given by Snell’s
law, which says that the ray parameter is constant along a ray.
Thus the ray parameter of the ray arriving at a given distance
can be found from the slope of the travel time curve T(Δ),

$$
p = \frac{r \sin i}{v} = \frac{dT}{d\Delta} .
$$

Hence taking r as the radius at the earthquake source and v
as the velocity at the source depth, the value of dT/dΔ for this
distance gives the ray’s angle of incidence at the source, often
called the take-off angle. How far a ray travels depends on its
take-off angle (Fig. 4.2-8); rays with large take-off angles leave
the source closer to the horizontal and travel shorter distances
than those with smaller take-off angles.
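The conversion from travel time slope to take-off angle can be sketched as follows. The function name `takeoff_angle_deg`, the units (slope in seconds per degree, velocity in km/s, radius in km), and all sample numbers are assumptions made for illustration; a real study would take the slope from a travel time table and the velocity from the chosen earth model. The arcsine returns angles below 90°, so the sketch applies only to rays that leave the source going downward.

```python
import math

def takeoff_angle_deg(slope_s_per_deg, v_km_s, r_km):
    """Take-off angle i from p = r sin(i) / v = dT/dDelta.

    slope_s_per_deg : dT/dDelta in seconds per degree of distance
    v_km_s          : wave speed at the source depth
    r_km            : radius (distance from earth's center) of the source
    """
    p = slope_s_per_deg * 180.0 / math.pi   # ray parameter in s/rad
    sin_i = p * v_km_s / r_km
    if not 0.0 <= sin_i <= 1.0:
        raise ValueError("no real take-off angle for these inputs")
    return math.degrees(math.asin(sin_i))

# Hypothetical slopes for a nearer and a more distant station.
for slope in (8.0, 5.0):
    angle = takeoff_angle_deg(slope, v_km_s=6.0, r_km=6371.0)
    print(f"dT/dDelta = {slope} s/deg -> take-off angle = {angle:.1f} deg")
```

The larger slope gives the larger take-off angle, matching the statement that rays leaving closer to the horizontal travel shorter distances.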

The distance that a ray has traveled thus gives its take-off 
angle. Table 4.2-1 is a sample table relating teleseismic travel 
distances and take-off angles for P waves from a surface-focus 
earthquake. These distances and angles depend on the velocity 
model assumed. In teleseismic first motion studies, stations 
at distances greater than 100° are generally not used because 
the rays hit the earth’s core, and stations at distances less
than 30° are often avoided because the take-off angles depend
strongly on the upper mantle velocity structure used. In local
earthquake studies, care is taken to ensure that the velocity 
model is appropriate. 

Using such tables, the distances to seismic stations can be
converted to take-off angles. Thus the locations of compressions
and dilatations can be converted to their positions on
the surface of the lower focal hemisphere, a hemisphere with
infinitesimal radius about the source. A similar approach can
be used for data directly above a deep earthquake, where the 
upper focal hemisphere is a natural representation. 

4.2.4 Stereographic fault plane representation 

We have seen that fault geometry can be found from the distribution
of data on a sphere around the focus. Because plotting
on a piece of paper is easier than plotting on a sphere, a stereographic
projection that transforms a hemisphere to a plane is
used to plot the data. The graphic construction that does this is
a stereonet (Fig. 4.2-9).³ On this net, the azimuth is shown by
the numbers from 0° to 360° around the circumference. The
dip angles are shown by the numbers from 90° to 0° along the
net’s equator. The angle 90°, straight down, hits the middle of
the net, whereas 0°, the horizontal direction, is at the edge.

To see how to use this net, consider how planes through 
the center of the focal sphere appear (Fig. 4.2-10). A vertically 
dipping, N-S-striking, plane intersects the hemisphere such 
that it plots as a straight line through the center of the net. A 
N-S-striking plane with a different dip intersects the net edge 
at 0° and 180°, but intersects the equator at a position corresponding
to the dip. For example, planes dipping 70°E and
60°W intersect the equator at the 70°E and 60°W marks. Thus,
meridians on the net (the curves going from the top to the 
bottom) represent N-S-striking planes with different dips. 

Planes striking in other azimuths are plotted in a similar way
(Fig. 4.2-11) by rotating the stereonet.⁴ Thus, a plane striking
at an angle φ (measured clockwise from north) is plotted by
rotating the stereonet so that the vertical (N-S) axis points in
the φ direction. The plane with the desired dip is now a meridian,
so it can be found using the scale along the equator. After
plotting the plane by tracing the appropriate meridian, we
rotate the net back to its original orientation. Hence planes
striking in azimuths other than N-S appear as meridians relative
to their strike direction, with the appropriate dip. All of
these meridians are thus great circles, the curves formed when
a plane through the center of the sphere intersects the surface of
the sphere.

³ Seismologists generally use an equal-area or Schmidt projection, rather than an
equal-angle or Wulff projection. The techniques used are the same for the two,

⁴ This can be done either by the traditional method, rotating a piece of tracing paper
over a stereonet, or by using a computer program that plots points and planes on a
stereonet.

Fig. 4.2-9 A stereonet used to display a hemisphere on a flat surface.
The azimuth is shown by the numbers around the circumference,
and dip angles are shown by the numbers along the equator.


We can also plot planes perpendicular to a given plane. To
do this, rotate the stereonet so that the plane lies on a meridian,
and find the point on the equator 90° from the intersection of
the plane with the equator (Fig. 4.2-12). This point is the pole
for the plane, because it represents the point at which the normal
to the plane intersects the sphere. Any plane perpendicular
to the first plane contains the normal, and hence must pass
through the pole. To draw such perpendicular planes, remember
that an arbitrary curve on the stereonet does not represent a
plane; only meridians are projections of planes. We thus rotate
the net in the desired direction and trace meridians going
through the pole.

Fig. 4.2-10 Three planes striking N-S on a stereonet. The meridians
(curves going from the top to the bottom) represent N-S-striking planes
with different dips.

To determine focal mechanisms, we plot the points where
rays intersect the focal sphere, so that the nodal planes can be
found. For example, to plot the point corresponding to a
ray whose azimuth is 40° and whose take-off angle is 60°,
we first rotate the net, placing the equator along azimuth 40°.
Because take-off angles i are measured from the vertical, they
correspond to dips of 90° − i. We thus mark the point with
dip 30°E, and rotate the net back so that north is at the top
(Fig. 4.2-13).
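A computer program replaces the rotate-and-mark procedure with a direct formula. The sketch below assumes a lower-hemisphere equal-area (Schmidt) projection scaled so that the net has unit radius, for which a ray with take-off angle i plots at radius √2 sin(i/2) from the center. The function name `equal_area_xy` and the axis convention (x east, y north) are illustrative choices.

```python
import math

def equal_area_xy(azimuth_deg, takeoff_deg):
    """Position of a ray on a unit-radius, lower-hemisphere equal-area net.

    azimuth_deg : clockwise from north
    takeoff_deg : measured from straight down (0 = center, 90 = edge)
    Returns (x, y) with x pointing east and y pointing north.
    """
    az = math.radians(azimuth_deg)
    i = math.radians(takeoff_deg)
    radius = math.sqrt(2.0) * math.sin(i / 2.0)
    return radius * math.sin(az), radius * math.cos(az)

# The example from the text: azimuth 40 deg, take-off angle 60 deg.
x, y = equal_area_xy(40.0, 60.0)
print(f"x = {x:.3f}, y = {y:.3f}, radius = {math.hypot(x, y):.3f}")
```

A vertical ray plots at the center of the net and a horizontal ray on its edge, matching the labeling of the equator from 90° at the middle to 0° at the rim.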

We can use these ideas to determine the focal mechanism
from a set of P-wave first motions. First, we find the polarities
of the first arrivals at seismic stations. Each station corresponds
to a point on the focal sphere with the same azimuth and an incidence
angle corresponding to the ray that emerged there. We
then plot the location of each station on the stereonet and mark
whether the first motion is dilatation or compression. Next, by
rotating the tracing paper or using a stereonet program, we
find the nodal planes that best separate the compressions from
the dilatations. In doing this, we ensure that the two planes
are orthogonal, with each one passing through the pole to the
other. Provided the distribution of stations on the focal sphere
is adequate, we can find the nodal planes, which are the fault
plane and the auxiliary plane.
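When testing a trial mechanism by computer, it helps to predict the polarity each station should record. The sketch below is one way to do this under stated assumptions: it rewrites the angular factor of Eqn 3 as 2(γ·n)(γ·d), where γ is the unit vector along the departing ray, and uses Eqns 1 and 2 for n and d in (north, west, up) coordinates. The function names and the sample stations are hypothetical.

```python
import math

def fault_vectors(strike_deg, dip_deg, slip_deg):
    """Normal n and slip vector d in (north, west, up), per Eqns 1 and 2."""
    phi, delta, lam = (math.radians(a) for a in (strike_deg, dip_deg, slip_deg))
    n = (-math.sin(delta) * math.sin(phi),
         -math.sin(delta) * math.cos(phi),
         math.cos(delta))
    d = (math.cos(lam) * math.cos(phi) + math.sin(lam) * math.cos(delta) * math.sin(phi),
         -math.cos(lam) * math.sin(phi) + math.sin(lam) * math.cos(delta) * math.cos(phi),
         math.sin(lam) * math.sin(delta))
    return n, d

def ray_direction(azimuth_deg, takeoff_deg):
    """Unit vector of a ray leaving the source; take-off from straight down."""
    az, i = math.radians(azimuth_deg), math.radians(takeoff_deg)
    return (math.sin(i) * math.cos(az), -math.sin(i) * math.sin(az), -math.cos(i))

def p_amplitude(strike_deg, dip_deg, slip_deg, azimuth_deg, takeoff_deg):
    """P-wave angular factor; > 0 is compression, < 0 is dilatation."""
    n, d = fault_vectors(strike_deg, dip_deg, slip_deg)
    g = ray_direction(azimuth_deg, takeoff_deg)
    g_dot_n = sum(a * b for a, b in zip(g, n))
    g_dot_d = sum(a * b for a, b in zip(g, d))
    return 2.0 * g_dot_n * g_dot_d

# Vertical N-S fault with slip angle 0: alternating quadrants.
for az in (45.0, 135.0, 225.0, 315.0):
    amp = p_amplitude(0.0, 90.0, 0.0, az, 60.0)
    print(az, "compression" if amp > 0 else "dilatation")
```

Comparing these predicted signs with the observed first motions, and counting the mismatches, gives a simple measure of how well a trial pair of nodal planes separates the data.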

Different types of faults appear differently on a stereonet
(Fig. 4.2-14). The black and white quadrants, representing
compression and dilatation, show the fault geometry. A four-quadrant
“checkerboard” indicates pure strike-slip motion on
a vertical fault plane. The motion would be right-lateral if one
plane is the fault plane, and left-lateral on the other. As we
mentioned earlier, often the distribution of aftershocks or geologic
information (or prejudices) is used to infer which was the
actual fault plane, and thus the sense of slip. A pure dip-slip
fault that dips at 45° (the fourth quadrant is on the upper focal
hemisphere) gives a three-quadrant “beachball.” The center region
is compressional for a thrust fault, and dilatational for a
normal fault. The difference reflects the different direction of
fault motion, as the side-view cartoon shows. For a dip-slip rupture
on a vertical fault, only two quadrants of the “beachball”
are visible, because the others are on the upper focal hemisphere.

Fig. 4.2-11 To plot a plane striking N45°E and dipping 60°E, rotate the
stereonet (or tracing paper above it) so that the strike is at the top and the
dip can be measured along the equator. After plotting the appropriate
meridian, rotate the net back to the geographic orientation with north
at the top.
[Panel annotation: First draw a plane with a dip of 60°E and a strike of 0°.]

Fig. 4.2-12 Plotting perpendicular planes on a stereonet. First, rotate the
first plane’s strike to the top of the stereonet, and plot the plane. Next, find
the pole, the point on the equator 90° away. Any plane through the pole is
perpendicular to the first plane. Several such planes, with different strikes
and dips, are shown.

The pattern is a little more complicated for oblique-slip faults 
with a mixture of strike-slip and dip-slip motion. The mechanisms
in Fig. 4.2-15 have the same N-S-striking, 45°E-dipping
fault plane, but with slip directions varying from pure thrust, 
to pure strike-slip, to pure normal. Thus the auxiliary plane 
varies but always passes through the normal to the fault plane, 
and the slip vector can be found because it is the normal to the 
auxiliary plane, and thus is in the fault plane (Fig. 4.2-5). 

It is important to bear in mind that although the focal mechanisms
look different, they reflect the same four-lobed P-wave
radiation pattern (Fig. 4.2-6). However, because the fault
plane and slip direction are oriented differently relative to the
earth’s surface, the projections of the radiation pattern lobes on
the lower focal hemisphere differ.⁵ Pure dip-slip motion on a
45° dipping fault has two lobes along the vertical axis, so the
nodal planes dip at 45°. By contrast, pure strike-slip motion on
a vertical plane has lobes in the plane of the surface, and the
null axis is vertical.

Fig. 4.2-13 To plot a point on a stereonet, rotate the azimuth of the
point to the equator, measure the take-off angle from the vertical (or
equivalently the dip from horizontal), plot the point and rotate back
to the geographic orientation with north at the top.

Fig. 4.2-14 Focal mechanisms for earthquakes with various fault
geometries. Compressional quadrants are black. The strike-slip
mechanism is for pure strike-slip motion on a vertical fault plane,
which could be oriented either NE-SW or NW-SE. The pure dip-slip
mechanisms are for faults striking N-S.

A common use of earthquake focal mechanisms is to infer
stress orientations in the earth. As discussed in Section 2.3.4,
a simple model predicts that the faulting occurs on planes
45° from the maximum and minimum compressive stresses.
Equivalently, these stress directions are halfway between the
nodal planes. Thus the maximum compressive (P) and minimum
compressive stress (T) axes can be found by bisecting
the dilatational and compressional quadrants, respectively
(Fig. 4.2-16). Although T is called the “tension” axis, it is
actually the minimum compressive stress, because compression
occurs at depth in the earth. The intermediate stress axis,
known as the B or null axis, is perpendicular to both the T and
the P axes. This direction is also perpendicular to both the slip
and the normal vectors, and is the intersection of the two nodal
planes.

⁵ This concept can be seen by marking the P-wave quadrants on a ball and rotating
it. For additional insight, the S-wave radiation pattern (Fig. 4.2-6) can also be marked on
the ball.

Fig. 4.2-15 Focal mechanisms for earthquakes with the same N-S-striking
fault plane, but with slip angles varying from pure thrust, to pure strike-slip,
to pure normal faulting.

Fig. 4.2-16 Cartoon illustrating the relation between fault planes and the
maximum compressive principal stress (P) and the minimum compressive
stress (T) axes. The P and T axes can be found by bisecting the dilatational
and compressional quadrants, respectively. On a stereonet, this is done by
using the great circle (meridian) connecting the poles for the two nodal
planes and finding the point halfway between them.
[Panel annotation: To obtain P and T axes: on the meridian connecting
the poles, the points halfway between the nodal planes are the
P and T axes.]

Fig. 4.2-17 Focal mechanisms and some seismograms for three different
earthquakes. Compressional quadrants are shown shaded.
[Events shown:
Thrust faulting, Vanuatu Islands, July 3, 1985.
Location: 17.2°S, 167.8°E. Depth: 30 km. Strike: 352°, Dip: 26°, Slip: 97°.
Normal faulting, mid-Indian rise, May 16, 1985.
Location: 29.1°S, 77.7°E. Depth: 10 km. Strike: 8°, Dip: 70°, Slip: 270°.
Strike-slip faulting, west of Oregon, March 13, 1985.
Location: 43.5°N, 127.6°W. Depth: 10 km. Strike: 302°, Dip: 90°, Slip: 186°.]

To bisect the angle between the two nodal planes on the
stereonet, we find the poles for the two planes (each of which is
in the other plane), draw the great circle (meridian) connecting
them, and mark the point on it halfway between the poles (Fig.
4.2-16). We can thus infer stress directions from a focal mechanism.
Different fault types correspond to different orientations
of the stress axes, as noted in Fig. 2.3-9. If the P axis is vertical,
the fault plane dips at 45°, and normal faulting occurs. If,
instead, the T axis is vertical, the fault geometry is the same, but
reverse faulting occurs. When the null axis is vertical, strike-slip
motion occurs on a fault plane 45° from the maximum and
minimum principal stresses, both of which lie in the plane of
the surface.

Figure 4.2-17 shows the focal mechanisms and a few of the
seismograms for three earthquakes. Note that in some cases the
first arrival is small and difficult to identify. This is especially
likely when the station is near a nodal plane, where the
amplitude is small. It is also worth noting that often many
stations plot near the center of the focal sphere, because they
are at large distances from the source, so rays to them have
small angles of incidence. As a result, it is sometimes hard to
constrain nodal planes, especially if the plane is far from the
vertical, as in the dip-slip examples shown. In such cases,
information about the waveforms as well as the polarity of the
waves is used, as discussed later.


4.2.5 Analytical representation of fault geometry 

In many applications, including seismic moment tensor analysis,
which we discuss shortly, it is useful to have analytic
expressions for the relations between the fault plane, the
auxiliary plane, and the stress axes. In Section 4.2.1, we
expressed the fault normal and slip vectors in a geographic
coordinate system, such that for a fault with strike $\phi_f$,
dip angle $\delta$, and slip angle $\lambda$ the fault normal and
slip vectors are

\[
\hat{\mathbf{n}} =
\begin{pmatrix}
-\sin\delta \sin\phi_f \\
-\sin\delta \cos\phi_f \\
\cos\delta
\end{pmatrix},
\qquad
\hat{\mathbf{d}} =
\begin{pmatrix}
\cos\lambda \cos\phi_f + \sin\lambda \cos\delta \sin\phi_f \\
-\cos\lambda \sin\phi_f + \sin\lambda \cos\delta \cos\phi_f \\
\sin\lambda \sin\delta
\end{pmatrix}.
\tag{8}
\]

Because the null (or B) axis is orthogonal to the fault normal
and slip vectors, a unit vector in this direction can be written

\[
\hat{\mathbf{b}} = \hat{\mathbf{n}} \times \hat{\mathbf{d}} =
\begin{pmatrix}
-\sin\lambda \cos\phi_f + \cos\lambda \cos\delta \sin\phi_f \\
\sin\lambda \sin\phi_f + \cos\lambda \cos\delta \cos\phi_f \\
\cos\lambda \sin\delta
\end{pmatrix}.
\tag{9}
\]
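
As a numerical check on Eqns 8 and 9, the following Python sketch builds
the three vectors and confirms that they are mutually orthogonal unit
vectors. The function names (`fault_vectors`, `null_axis`) and the use of
NumPy are illustrative choices rather than anything defined in the text,
and angles are assumed to be given in degrees.

```python
import numpy as np

def fault_vectors(strike, dip, slip):
    """Return the unit fault normal n and slip vector d of Eqn 8 (degrees in)."""
    phi, delta, lam = np.radians([strike, dip, slip])
    n = np.array([-np.sin(delta) * np.sin(phi),
                  -np.sin(delta) * np.cos(phi),
                  np.cos(delta)])
    d = np.array([np.cos(lam) * np.cos(phi) + np.sin(lam) * np.cos(delta) * np.sin(phi),
                  -np.cos(lam) * np.sin(phi) + np.sin(lam) * np.cos(delta) * np.cos(phi),
                  np.sin(lam) * np.sin(delta)])
    return n, d

def null_axis(strike, dip, slip):
    """Return the null (B) axis as written out component by component in Eqn 9."""
    phi, delta, lam = np.radians([strike, dip, slip])
    return np.array([-np.sin(lam) * np.cos(phi) + np.cos(lam) * np.cos(delta) * np.sin(phi),
                     np.sin(lam) * np.sin(phi) + np.cos(lam) * np.cos(delta) * np.cos(phi),
                     np.cos(lam) * np.sin(delta)])

# Example: the Vanuatu thrust mechanism listed with Fig. 4.2-17.
n, d = fault_vectors(352.0, 26.0, 97.0)
b = null_axis(352.0, 26.0, 97.0)

# n and d should be perpendicular, and Eqn 9 should equal the cross product n x d.
print(np.dot(n, d), np.max(np.abs(np.cross(n, d) - b)))
```

Both printed numbers should be zero to within rounding error, which is a
quick way to catch a sign slip when transcribing the components.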


Similarly, to find vectors p and t along the P and T axes, note
that they are in the plane containing d and n and lie halfway
between them, so

\[
\begin{aligned}
\mathbf{t} &= \hat{\mathbf{n}} + \hat{\mathbf{d}}, & t_i &= n_i + d_i, \\
\mathbf{p} &= \hat{\mathbf{n}} - \hat{\mathbf{d}}, & p_i &= n_i - d_i, \\
\hat{\mathbf{b}} &= \hat{\mathbf{n}} \times \hat{\mathbf{d}}.
\end{aligned}
\tag{10}
\]

It turns out that the null axis is perpendicular to both the
P and the T axes. To see this, we use the cross-product (Eqn
A.3.43) to form a vector perpendicular to both axes,

\[
\begin{aligned}
\tfrac{1}{2}(\mathbf{t} \times \mathbf{p})
&= \tfrac{1}{2}(\hat{\mathbf{n}} + \hat{\mathbf{d}}) \times (\hat{\mathbf{n}} - \hat{\mathbf{d}})
 = (\varepsilon_{ijk}/2)(n_j + d_j)(n_k - d_k) \\
&= (\varepsilon_{ijk}/2)(n_j n_k - n_j d_k + d_j n_k - d_j d_k),
\end{aligned}
\tag{11}
\]

and simplify, using

\[
\hat{\mathbf{n}} \times \hat{\mathbf{n}} = \varepsilon_{ijk} n_j n_k = 0, \qquad
\hat{\mathbf{d}} \times \hat{\mathbf{d}} = \varepsilon_{ijk} d_j d_k = 0, \qquad
\varepsilon_{ijk} d_j n_k = -\varepsilon_{ijk} n_j d_k,
\tag{12}
\]

to see that

\[
\tfrac{1}{2}(\mathbf{t} \times \mathbf{p}) = -\varepsilon_{ijk} n_j d_k
= -(\hat{\mathbf{n}} \times \hat{\mathbf{d}}),
\tag{13}
\]

which is just the negative of a unit vector along the null axis, b.
Thus either the fault normal vector, slip vector, and null axis
or the P, T, and B (null) axes can be used for an orthogonal
coordinate system.

The relationship between the fault and auxiliary planes can
be derived from the fact that the slip vector, which lies in the
fault plane, is the normal to the auxiliary plane and vice versa.
Thus if $\hat{\mathbf{n}}_1, \hat{\mathbf{d}}_1$ and
$\hat{\mathbf{n}}_2, \hat{\mathbf{d}}_2$ are the fault normal and
slip vectors for the two nodal planes,

\[
\hat{\mathbf{d}}_1 = \hat{\mathbf{n}}_2 \quad \text{and} \quad
\hat{\mathbf{d}}_2 = \hat{\mathbf{n}}_1.
\tag{14}
\]

Writing out $\hat{\mathbf{d}}_1 = \hat{\mathbf{n}}_2$ by components,

\[
\begin{pmatrix}
\cos\lambda_1 \cos\phi_{f1} + \sin\lambda_1 \cos\delta_1 \sin\phi_{f1} \\
-\cos\lambda_1 \sin\phi_{f1} + \sin\lambda_1 \cos\delta_1 \cos\phi_{f1} \\
\sin\lambda_1 \sin\delta_1
\end{pmatrix}
=
\begin{pmatrix}
-\sin\delta_2 \sin\phi_{f2} \\
-\sin\delta_2 \cos\phi_{f2} \\
\cos\delta_2
\end{pmatrix}.
\tag{15}
\]

The corresponding relation between $\hat{\mathbf{n}}_1$ and
$\hat{\mathbf{d}}_2$ is found simply by interchanging subscripts.

These equations relate the strike, dip, and slip angles for one
plane to the other. To use them, we multiply the first by
$\cos\phi_{f1}$ and the second by $\sin\phi_{f1}$, and subtract
them to find

\[
\cos\lambda_1 = \sin\delta_2 \sin(\phi_{f1} - \phi_{f2}),
\tag{16}
\]

or, equivalently,

\[
\cos\lambda_2 = \sin\delta_1 \sin(\phi_{f2} - \phi_{f1}).
\tag{17}
\]

We also have the third equation

\[
\cos\delta_2 = \sin\lambda_1 \sin\delta_1,
\tag{18}
\]

or, equivalently,

\[
\cos\delta_1 = \sin\lambda_2 \sin\delta_2.
\tag{19}
\]

An additional constraint comes from the fact that the two
nodal planes are perpendicular:

\[
\hat{\mathbf{n}}_1 \cdot \hat{\mathbf{n}}_2 = 0,
\tag{20}
\]

so

\[
\begin{aligned}
\sin\delta_1 \sin\phi_{f1} \sin\delta_2 \sin\phi_{f2}
+ \sin\delta_1 \cos\phi_{f1} \sin\delta_2 \cos\phi_{f2}
+ \cos\delta_1 \cos\delta_2 &= 0, \\
\sin\delta_1 \sin\delta_2 \cos(\phi_{f1} - \phi_{f2}) + \cos\delta_1 \cos\delta_2 &= 0;
\end{aligned}
\tag{21}
\]

or

\[
\tan\delta_1 \tan\delta_2 \cos(\phi_{f1} - \phi_{f2}) = -1.
\tag{22}
\]

These equations allow us to find the second nodal plane
and the slip vector on it ($\phi_{f2}$, $\delta_2$, $\lambda_2$)
from the first nodal plane and the slip on it ($\phi_{f1}$,
$\delta_1$, $\lambda_1$). The hard part, getting the angles
in the appropriate quadrants, can be done by first finding
$\delta_2$ from Eqn 18, and then finding $\sin\lambda_2$ from
Eqn 19 and $\cos\lambda_2$ by combining Eqns 16 and 17. Given both
sine and cosine, $\lambda_2$ can be placed in the correct quadrant.
We then find $\phi_{f2}$ from Eqns 22 and 16. Finally, if
90° < $\delta_2$ < 180°, we change ($\phi_{f2}$, $\delta_2$,
$\lambda_2$) to (180° + $\phi_{f2}$, 180° − $\delta_2$,
360° − $\lambda_2$).

If the nodal planes have been found from first motions using
a stereonet, the situation differs because the strike and dip of
both planes are known, but the slip angles are not. We then
choose one nodal plane and find the slip angle on it. This can be
done using Eqns 16 and 18 to find $\cos\lambda_1$ and
$\sin\lambda_1$, and then placing $\lambda_1$ in the correct
quadrant.
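
The quadrant bookkeeping described above can be sidestepped numerically
by working with the vectors themselves. The sketch below is an
alternative to the angle-by-angle procedure in the text, not a
transcription of it: it sets $\hat{\mathbf{n}}_2 = \hat{\mathbf{d}}_1$ and
$\hat{\mathbf{d}}_2 = \hat{\mathbf{n}}_1$ (Eqn 14), flips both vectors if
the new normal points downward (equivalent to the final 180° adjustment
in the text), and inverts Eqn 8 with two-argument arctangents. The
function names are hypothetical and angles are assumed to be in degrees.

```python
import numpy as np

def fault_vectors(strike, dip, slip):
    """Unit fault normal n and slip vector d of Eqn 8 (angles in degrees)."""
    phi, delta, lam = np.radians([strike, dip, slip])
    n = np.array([-np.sin(delta) * np.sin(phi),
                  -np.sin(delta) * np.cos(phi),
                  np.cos(delta)])
    d = np.array([np.cos(lam) * np.cos(phi) + np.sin(lam) * np.cos(delta) * np.sin(phi),
                  -np.cos(lam) * np.sin(phi) + np.sin(lam) * np.cos(delta) * np.cos(phi),
                  np.sin(lam) * np.sin(delta)])
    return n, d

def plane_angles(n, d):
    """Invert Eqn 8: recover (strike, dip, slip) in degrees from n and d."""
    if n[2] < 0:                       # keep the normal pointing up, so dip <= 90
        n, d = -n, -d
    dip = np.arccos(np.clip(n[2], -1.0, 1.0))
    strike = np.arctan2(-n[0], -n[1])
    # Project d onto the strike direction and onto the up-dip direction.
    cos_slip = d[0] * np.cos(strike) - d[1] * np.sin(strike)
    sin_slip = ((d[0] * np.sin(strike) + d[1] * np.cos(strike)) * np.cos(dip)
                + d[2] * np.sin(dip))
    slip = np.arctan2(sin_slip, cos_slip)
    return tuple(np.degrees([strike, dip, slip]) % 360.0)

def auxiliary_plane(strike, dip, slip):
    """Strike, dip, and slip of the second nodal plane, using Eqn 14."""
    n1, d1 = fault_vectors(strike, dip, slip)
    return plane_angles(d1, n1)        # n2 = d1, d2 = n1

# Example: auxiliary plane for the Vanuatu thrust event of Fig. 4.2-17.
print(auxiliary_plane(352.0, 26.0, 97.0))
```

Because the result is built from the vectors, it satisfies Eqns 16, 18,
and 22 automatically, which makes those relations convenient checks.
For a horizontal plane the strike is undefined, so that case would need
separate handling.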

4.3 Waveform modeling 

As noted in the previous section, P-wave first motions are often
inadequate to constrain focal mechanisms. Additional information
is obtained by comparing the observed body and surface waves to
theoretical, or synthetic, waveforms computed for various source
parameters, and finding a model that best fits the data, either
by forward modeling or by inversion. Waveform analysis also gives
information about the earthquake depths and rupture processes
which cannot be extracted from the first motions. We discuss such
analysis first for body waves and then for surface waves.


4.3.1 Basic model 

To generate synthetic waveforms, we regard the ground motion
recorded on a seismogram as a combination of factors: the
earthquake source, the earth structure through which the waves
propagated, and the seismometer. Each factor can be thought
of as an operation whose effects depend on the frequency of
the seismic waves. Hence it is often useful to think of the
seismogram u(t) in terms of its Fourier transform U(ω), which
represents the contribution of the different frequencies:

\[
u(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} U(\omega)\, e^{i\omega t}\, d\omega,
\qquad
U(\omega) = \int_{-\infty}^{\infty} u(t)\, e^{-i\omega t}\, dt.
\tag{1}
\]

As in earlier discussions (Sections 2.8, 3.3, 3.7), we use
the Fourier transform and related concepts while deferring
more general treatment of Fourier analysis to Chapter 6. The
essence of this approach is that we represent a seismogram or
individual factors that make it up either as a time series or by
its Fourier transform, depending on which is more convenient,
and switch back and forth using the transform and inverse
transform relations.

This approach to generating synthetic seismograms from
earthquakes is conceptually the same as that discussed in
Section 3.3.6 for reflection seismograms. There, we described the
combined effect of various factors as the convolution of time
series representing each factor. Recall that the convolution of
two time series w(t) and r(t) is written

\[
s(t) = w(t) * r(t) = \int_{-\infty}^{\infty} w(t - \tau)\, r(\tau)\, d\tau.
\tag{2}
\]

Thus a seismogram u(t) can be written

\[
u(t) = x(t) * e(t) * q(t) * i(t),
\tag{3}
\]

where x(t) is the source time function, the “signal” the
earthquake puts into the ground, e(t) and q(t) represent the
effects of earth structure, and i(t) describes the instrument
response of the seismometer. We also noted (and will prove in
Section 6.3.1) that convolution in the time domain is equivalent
to multiplication in the frequency domain, so Eqn 3 can be
written as the product of Fourier transforms of the four factors

\[
U(\omega) = X(\omega)\, E(\omega)\, Q(\omega)\, I(\omega).
\tag{4}
\]
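
The equivalence of Eqns 3 and 4 is easy to confirm numerically. The
sketch below uses only two factors, a boxcar standing in for x(t) and a
damped sinusoid standing in for i(t); both shapes, the sample interval,
and the variable names are arbitrary assumptions made for illustration,
not models of a real source or instrument.

```python
import numpy as np

dt = 0.1                                   # sample interval (s), arbitrary
t = np.arange(0, 20, dt)

x = np.where(t < 2.0, 1.0, 0.0)            # stand-in source time function
i_resp = np.exp(-t / 1.5) * np.sin(2 * np.pi * t / 15.0)   # stand-in instrument

# Time domain: discrete approximation to the convolution integral of Eqn 2.
u_time = np.convolve(x, i_resp) * dt

# Frequency domain: multiply the transforms (Eqn 4). Zero-padding to the full
# output length makes the FFT's circular convolution match the linear one.
n = len(x) + len(i_resp) - 1
u_freq = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(i_resp, n), n) * dt

print(np.max(np.abs(u_time - u_freq)))     # difference at rounding-error level
```

Adding further factors only adds further multiplications in the
frequency domain, which is one reason Eqn 4 is often the more convenient
form.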

Each factor can be described in the time domain or the
frequency domain. For example, the seismogram depends on how
the seismometer responds to ground motion of different
frequencies. Figure 4.3-1 (top) shows the instrument response,
the amplification of a signal as a function of period, for a
long-period seismometer. Ground motion with periods around the
peak response (T = 15 s) is enhanced relative to that at longer
or shorter periods. As discussed in Section 6.6, seismometer
responses differ; some have peak response at short (e.g., 1 s)
periods, whereas others have better response at longer periods.
The seismometer response can also be described in the time
domain by taking its inverse Fourier transform (Fig. 4.3-1,
bottom). The resulting time series, i(t), is the impulse response,
describing how the seismometer responds to a sharp impulse.
For the seismometer illustrated in Fig. 4.3-1, the impulse
response has a sharp initial peak, followed by a smaller
“backswing.”

In this formulation, the effects of earth structure are divided
into two factors. One, e(t), gives the effect of reflections and
conversions of seismic waves at different interfaces along the
ray path and the effect of geometric spreading of the rays due
to the velocity structure (Section 3.4.2). All these effects are
elastic wave phenomena. There is also anelastic attenuation
described by q(t), whereby some of the seismic waves’
mechanical energy is lost by conversion into heat. Attenuation,
discussed in Section 3.7, is illustrated by the decay with time
of a damped harmonic oscillation with frequency ω:

\[
f(t) = A\, e^{i\omega t}\, e^{-\omega t / 2Q}.
\tag{5}
\]

The quality factor Q characterizes the attenuation: the
amplitude decays by $e^{-1}$ in a time $2Q/\omega$ (Fig. 3.7-11),
so the higher that Q is, the slower the decay, and thus the lower
the attenuation. The operators q(t) or Q(ω) describe the effect of
attenuation over the range of frequencies making up the seismogram
being synthesized.

Fig. 4.3-1 The response of a long-period seismometer. Top: Gain, or
magnification, of an arriving signal as a function of period. Bottom:
Impulse response in the time domain. This seismometer is a long-period
Worldwide Standardized Seismographic Network (WWSSN) analog
instrument, a type installed around the world in the 1960s that produced
many crucial results prior to the advent of digital instrumentation.
(Figure labels: instrument response (WWSSN 15-100); horizontal axis,
period (s).)
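
To get a feel for Eqn 5, the following Python sketch evaluates the
amplitude envelope and the e-folding time 2Q/ω for two periods. The
value Q = 200 and the periods are illustrative assumptions, and the
function names are hypothetical.

```python
import numpy as np

def attenuated_amplitude(t, period, Q, A=1.0):
    """Envelope A*exp(-omega*t/(2Q)) of the damped oscillation in Eqn 5."""
    omega = 2.0 * np.pi / period
    return A * np.exp(-omega * t / (2.0 * Q))

def e_folding_time(period, Q):
    """Time 2Q/omega for the amplitude to fall by a factor of e."""
    omega = 2.0 * np.pi / period
    return 2.0 * Q / omega

Q = 200.0                                  # assumed quality factor
for period in (1.0, 20.0):                 # short- and long-period waves (s)
    print(period, e_folding_time(period, Q))
```

For a fixed Q the e-folding time is proportional to the period, so the
1 s wave loses amplitude twenty times faster than the 20 s wave.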

4.3.2 Source time function

The earthquake source signal, x(t), is the source time function
produced by the faulting. In the simplest case of a short fault
that slips instantaneously, the seismic moment function
(Eqn 4.2.4) is a step function whose derivative, a delta function
(Section 6.2.5), is the source time function. Real faults,
however, give rise to more complicated source time functions.
Consider a simple case in which the rupture at each point on
a rectangular fault radiates an impulse. The total radiated
signal is nevertheless not impulsive, because the finite fault
does not all break at the same time. Instead, waves arrive first
from the initial point of rupture, and later from points further
along the fault. Assume (Fig. 4.3-2) that the rupture propagated
at the rupture velocity $v_R$ along a fault of length L. Consider
a receiver at a distance $r_0$ and azimuth θ from the initial
point of rupture. The first seismic arrival is at time $r_0/v$,
where v is either α or β, for P or S waves, respectively. The far
end of the fault ruptures a time $L/v_R$ later, giving a seismic
arrival at time ($L/v_R + r/v$), where r is the distance from the
far end to the receiver. The law of cosines shows that

\[
r^2 = r_0^2 + L^2 - 2 r_0 L \cos\theta,
\tag{6}
\]

which, for points far from the fault (r ≫ L), is approximately

\[
r \approx r_0 - L \cos\theta.
\tag{7}
\]

Thus the time pulse due to the finite fault length is a “boxcar”
of duration

\[
T_R = L\,(1/v_R - \cos\theta / v) = (L/v)(v/v_R - \cos\theta),
\tag{8}
\]

known as the rupture time. Because $v_R$ is typically assumed to
be about 0.7-0.8 times the shear velocity β, $v/v_R$ is about 1.2
for shear waves and 2.2 for P waves. The maximum duration
occurs 180° from the rupture direction, and the minimum is in
the rupture direction.1 These expressions can be modified for
different fault shapes and rupture propagation directions, such as
rupture propagating outward from the center of a circular fault.

Fig. 4.3-2 For a fault of length L, the duration of the source time function
varies as a function of azimuth, depending on the ratio of the rupture
velocity $v_R$ and the wave velocity v. (Figure labels: station; “boxcar”
time pulse of duration $T_R = L(1/v_R - \cos\theta/v)$.)
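
The azimuthal dependence in Eqn 8 can be tabulated with a few lines of
Python. This is a sketch under stated assumptions: the shear velocity of
3.5 km/s, the Poisson-solid ratio α = √3 β, the 30 km fault length, and
the function name `rupture_time` are all illustrative choices, not values
taken from the text.

```python
import numpy as np

def rupture_time(L, v, v_rupture, theta_deg):
    """Boxcar duration T_R of Eqn 8 (s).

    L is the fault length (km), v the wave velocity and v_rupture the rupture
    velocity (km/s), theta_deg the azimuth from the rupture direction (degrees).
    """
    theta = np.radians(theta_deg)
    return (L / v) * (v / v_rupture - np.cos(theta))

beta = 3.5                      # assumed shear velocity (km/s)
alpha = np.sqrt(3.0) * beta     # P velocity for a Poisson solid
v_r = 0.8 * beta                # rupture velocity, upper end of 0.7-0.8 beta

print("v/v_R for S and P:", beta / v_r, alpha / v_r)
for theta in (0.0, 90.0, 180.0):
    print(theta,
          rupture_time(30.0, alpha, v_r, theta),   # P waves
          rupture_time(30.0, beta, v_r, theta))    # S waves
```

With these numbers $v/v_R$ is 1.25 for S waves and about 2.2 for P waves,
and the duration grows steadily from the rupture direction (0°) to the
opposite direction (180°).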

1 A familiar analogous effect occurs during thunderstorms. Thunder is generated
by the sudden heating of air along a lightning channel in the atmosphere. Observers
in positions perpendicular to the channel hear a brief, loud thunder clap, whereas
observers in the channel direction hear a prolonged rumble. Here the minimum
duration occurs at azimuth 90°, and the maxima are at 0° and 180°, because the
“rupture” velocity is much greater than the sound velocity, so $v/v_R$ is
approximately zero, and the time function duration varies as cos θ (Few, 1980).

Fig. 4.3-3 The source time function depends on the derivative of the
history of slip on the fault. A ramp time history (top) with duration $T_D$ has
a “boxcar” time derivative. When convolved with the “boxcar” time
function due to rupture propagation (center), a trapezoidal source time
function results (bottom). (Figure label: derivative (velocity) is a boxcar
function.)

Fig. 4.3-4 Effects of rupture directivity on the source time function at
different azimuths from the rupture. Because the same energy arrives, the
area of each source time function, corresponding to the seismic moment, is
the same. However, in the direction of rupture propagation more energy
arrives in a shorter time, whereas in the opposite direction less energy
arrives over a greater duration. (Figure labels: fault and rupture direction;
source time functions at θ = 0°, 90°, 180°, and 270°, each with
area = $M_0$.)

A second effect lengthening the time function is that, even at
a single location on the fault, slip does not occur
instantaneously. The slip history is often modeled as a ramp
function (Fig. 4.3-3) that begins at time zero and ends at the
rise time $T_D$. The source time function depends on the
derivative of the slip history, as noted in Section 4.2.3. For a
ramp, the derivative is a “boxcar.” Convolving the finiteness and
rise time effects yields a trapezoid whose length is the sum of
the rise and rupture times, which is often used to represent an
earthquake source time function. Other shapes of comparable
length, like triangles, are also used, because (as we will see)
seismograms are often insensitive to the details of the source
time function. However, we will also see that for large
earthquakes, body wave modeling can resolve a more complicated
time function corresponding to the variation in slip along the
fault as a function of space and time.
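
The trapezoidal time function can be built numerically by convolving the
two boxcars, as in the minimal sketch below. The rupture time of 4 s,
rise time of 1 s, sample interval, and function names are assumptions
chosen only to make the example concrete; each boxcar is normalized to
unit area so that the result also has unit area.

```python
import numpy as np

def boxcar(duration, dt):
    """Unit-area boxcar of the given duration (s), sampled at interval dt."""
    n = int(round(duration / dt))
    return np.full(n, 1.0 / (n * dt))

def trapezoid_stf(rupture_time, rise_time, dt):
    """Convolve the rupture-time and rise-time boxcars into a trapezoid."""
    return np.convolve(boxcar(rupture_time, dt), boxcar(rise_time, dt)) * dt

dt = 0.05                                   # sample interval (s)
stf = trapezoid_stf(rupture_time=4.0, rise_time=1.0, dt=dt)

# Total length is close to the sum of the two durations; the area stays 1.
print(len(stf) * dt, stf.sum() * dt, stf.max())
```

Scaling the result by the seismic moment would give a moment-rate
function. Swapping in a different pair of durations shows how the flat
top shrinks to a triangle when the rise and rupture times are equal.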

The radiated pulse varies in time duration as a function of
azimuth from the rupture direction, due to the finite rupture
length (Eqn 8). Because the area of the pulse is the same at all
azimuths, the magnitude of the source time function varies
inversely with its duration (Fig. 4.3-4). In some cases these
effects, called directivity, can be used to identify the fault plane
(because no similar effect is associated with the auxiliary plane)
and study the rupture propagation. Directivity is related to
the Doppler effect for sound and light waves, which shifts the
frequency of a moving oscillator to higher frequency when the
oscillator moves toward an observer, and lower frequency when
it moves away. However, directivity results from interference
between different parts of a finite fault, whereas the Doppler
effect in its simplest form occurs for a moving point source.2

An interesting question is when we need to consider the
effects of a finite earthquake source. We have shown (Eqn 8)
that the difference in the arrival time of waves traveling at
velocity v from different parts of the fault with length L is the
rupture time $T_R$, which is approximately L/v. If this difference
is comparable to the period T of the seismic wave, the arriving
waveform will be significantly affected. Thus, when the ratio

\[
\frac{T_R}{T} = \frac{L/v}{\lambda/v} = \frac{L}{\lambda}
\tag{9}
\]

is small, the fault length is short compared to the wavelength λ
of the seismic waves, and we can neglect the finiteness of the
source and treat it as a point. This criterion is similar to that
noted in Section 3.2.3, that seismic waves cannot “see” earth
structures much smaller than their wavelengths. For a finite
fault, this occurs because the rupture velocity is comparable to
the seismic velocity.

An interesting consequence of Eqn 9 is that a fault can seem 
finite for body waves, but not for surface waves. A 10 km-long 
fault, which we might expect for a magnitude 6 earthquake, is 
comparable to the wavelength of a 1 s body wave propagating 
at 8 km/s, but small compared to the 200 km wavelength of a 
50 s surface wave propagating at 4 km/s. On the other hand, 
a 300 km-long fault for a magnitude 8 earthquake would be a 
finite source for both waves. 
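
The four cases in this paragraph can be checked directly from Eqn 9. In
the short sketch below the function name and the case labels are
hypothetical; the fault lengths, periods, and velocities are the ones
quoted above.

```python
def fault_to_wavelength_ratio(fault_length_km, period_s, velocity_km_s):
    """Ratio L/lambda of Eqn 9, with wavelength lambda = velocity * period."""
    return fault_length_km / (velocity_km_s * period_s)

cases = {
    "M6 fault, 1 s body wave": (10.0, 1.0, 8.0),
    "M6 fault, 50 s surface wave": (10.0, 50.0, 4.0),
    "M8 fault, 1 s body wave": (300.0, 1.0, 8.0),
    "M8 fault, 50 s surface wave": (300.0, 50.0, 4.0),
}
for label, (length, period, velocity) in cases.items():
    print(label, fault_to_wavelength_ratio(length, period, velocity))
```

Only the 10 km fault seen by 50 s surface waves gives a ratio well below
one, which is the case in which the point-source approximation is
reasonable.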

4.3.3 Body wave modeling 

The elastic structure operator e(t) representing the effects of
reflections and transmissions along the ray path primarily
reflects interactions near the earth’s surface, where the largest
change in physical properties occurs. It is thus useful to
consider two simple cases. For a deep earthquake, the surface
reflections and other reflected, refracted, and diffracted
arrivals arrive much later than the direct P wave, so we can
describe the direct P wave without them. Moreover, at distances
30° < Δ < 90° from the source, the effects of upper mantle
triplications and core structure (Section 3.5) can be ignored.
Thus, the structure operator can be neglected, and only the
source, attenuation, and seismometer are considered to describe
the first pulse on a seismogram (Fig. 4.3-5).

2 The Doppler effect is used to detect motion in applications ranging from police
and weather radar to astronomical studies of “red-shifted” light that show the
universe expanding. For discussion of the relation between directivity and the
Doppler effect, see Douglas et al. (1988).

Fig. 4.3-5 The P-wave arrival waveform for a deep earthquake combines
the effect of the source time function, attenuation, and the instrument.
Near-source structure can be neglected because surface reflections arrive
much later. (After Chung and Kanamori, 1980. Phys. Earth Planet. Inter.,
23, 134-59, with permission from Elsevier Science.)

Fig. 4.3-6 Top: The P-wave arrival for a shallow earthquake at distance
30° < Δ < 90° from the source is modeled as the sum of arrivals due to the
direct P wave and the free surface reflections pP and sP. Bottom: Geometric
construction used to derive the delay time of pP with respect to direct P.

On the other hand, for a shallow earthquake, reflections
off the earth’s surface arrive shortly after the direct arrival.
We thus model the first few seconds of the P-wave arrival as
the sum of three arrivals (Fig. 4.3-6, top): the direct P wave, the
P wave reflected from the surface (pP), and the S wave that
converted to a P wave at the surface (sP).

The two surface reflections arrive after the direct P wave.
Figure 4.3-6 (bottom) shows that pP is delayed with respect
to P by approximately

\[
\delta t_{pP} = (2h \cos i)/\alpha,
\tag{10}
\]

where h is the source depth, and i and α are the incidence angle
and velocity for P waves. A messier calculation shows that for a
Poisson solid, sP is delayed by

\[
\delta t_{sP} = (h/\alpha)\left(\cos i + (3 - \sin^2 i)^{1/2}\right).
\tag{11}
\]

For shallow earthquakes the initial waveform reflects all
three arrivals. For example, for a source 10 km deep in a
medium with α = 6.8 km/s, the time delays $\delta t_{pP}$ and
$\delta t_{sP}$ are 2.7 s and 3.8 s at a distance Δ = 50°, where
the incidence angle is 24°. These arrivals are hard to resolve
from the P arrival, because the seismometer’s impulse response
(Fig. 4.3-1) is long enough that it has not completely responded
to the direct arrival before the others arrive.
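
The worked example above follows directly from Eqns 10 and 11, as this
short Python sketch shows. The function name is hypothetical, and the
Poisson-solid assumption (β = α/√3) is built into the sP expression
exactly as in Eqn 11.

```python
import math

def depth_phase_delays(depth_km, alpha_km_s, incidence_deg):
    """Delays of pP and sP after direct P (Eqns 10 and 11), Poisson solid."""
    i = math.radians(incidence_deg)
    dt_pP = 2.0 * depth_km * math.cos(i) / alpha_km_s
    dt_sP = (depth_km / alpha_km_s) * (math.cos(i) + math.sqrt(3.0 - math.sin(i) ** 2))
    return dt_pP, dt_sP

# Source 10 km deep, alpha = 6.8 km/s, incidence angle 24 degrees.
dt_pP, dt_sP = depth_phase_delays(10.0, 6.8, 24.0)
print(round(dt_pP, 1), round(dt_sP, 1))   # 2.7 3.8, matching the text
```

Because both delays scale linearly with depth, doubling the depth to
20 km doubles the separations, which is why the timing of pP and sP is
such a sensitive depth indicator.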

The four factors in Eqn 3 can be combined to synthesize
body waves. Although the derivation has some subtleties,
the result reflects the basic ideas just discussed. The
displacement as a function of time, distance, and azimuth, for an
initial P-wave arrival at distances 30-90° from the source, is

\[
\begin{aligned}
u(t, \Delta, \phi) = {} & i(t) * q(t) *
\frac{M_0}{4\pi \rho_h \alpha_h^3}\, \frac{g(\Delta)}{a}\, C(i_0) \times \\
& \Big[ R^P(\phi, i_h)\, x(t - \tau_P)
+ R^P(\phi, \pi - i_h)\, \Pi_{PP}(i_h)\, x(t - \tau_{pP}) \\
& \;\; + R^{SV}(\phi, \pi - j_h)\, \frac{\alpha_h \cos i_h}{\beta_h \cos j_h}\,
\Pi_{SP}(j_h)\, x(t - \tau_{sP}) \Big].
\end{aligned}
\tag{12}
\]

This formulation includes the seismometer and attenuation
factors and a complicated-looking third term incorporating
the source and structure factors. This term has distinct pieces,
each with a physical interpretation. The amplitude scale factor
$M_0/(4\pi \rho_h \alpha_h^3)$ contains the earthquake’s seismic
moment $M_0$ and the density and P-wave velocity at the source
depth h. The $g(\Delta)/a$ factor, where a is the earth’s radius,
describes the amplitude variations due to geometric spreading of
rays. The $C(i_0)$ factor corrects the amplitude for the effects
of the free surface, where the rays arrive at the receiver at an
angle of incidence, $i_0$.

The term in brackets has three parts, corresponding to P, pP,
and sP. Each includes the source time function x(t) lagged by
the travel time for that ray, $\tau_P$, $\tau_{pP}$, and
$\tau_{sP}$. Each arrival’s amplitude depends on the body wave
radiation pattern at the source for that wave type

\[
\begin{aligned}
R^P(\phi, i) &= s_R (3\cos^2 i - 1) - q_R \sin 2i - p_R \sin^2 i, \\
R^{SV}(\phi, j) &= \tfrac{3}{2} s_R \sin 2j + q_R \cos 2j + \tfrac{1}{2} p_R \sin 2j, \\
R^{SH}(\phi, j) &= -q_L \cos j - p_L \sin j,
\end{aligned}
\tag{13}
\]

which depend on the take-off angle (i for P waves and j for
S waves) and a set of fault geometry factors which include
the fault strike, dip, and slip angles (Fig. 4.2-2) $\phi_f$,
$\delta$, $\lambda$, and the azimuth $\phi$ (clockwise from north)
to the station. For P-SV waves these factors are

\[
\begin{aligned}
s_R &= \sin\lambda \sin\delta \cos\delta, \\
q_R &= \sin\lambda \cos 2\delta \sin(\phi_f - \phi) + \cos\lambda \cos\delta \cos(\phi_f - \phi), \\
p_R &= \cos\lambda \sin\delta \sin 2(\phi_f - \phi) - \sin\lambda \sin\delta \cos\delta \cos 2(\phi_f - \phi),
\end{aligned}
\]

and those for SH waves are

\[
\begin{aligned}
p_L &= \sin\lambda \sin\delta \cos\delta \sin 2(\phi_f - \phi) + \cos\lambda \sin\delta \cos 2(\phi_f - \phi), \\
q_L &= -\cos\lambda \cos\delta \sin(\phi_f - \phi) + \sin\lambda \cos 2\delta \cos(\phi_f - \phi).
\end{aligned}
\tag{14}
\]

The reflected phases’ amplitudes also include the plane wave
potential reflection coefficients at the free surface,
$\Pi_{PP}(i_h)$ and $\Pi_{SP}(j_h)$, which depend on the angles of
incidence. Finally, the sP term is scaled by a factor
$(\alpha_h \cos i_h)/(\beta_h \cos j_h)$ which incorporates
several effects, including the fact that near the source
the wave incident on the surface is better treated as a spherical
wave than a plane wave.

Fig. 4.3-7 Cartoon illustrating the relative
polarities and amplitudes of the direct P wave and
the near-source free surface reflections pP and sP
for different focal mechanisms. The arrivals are
shown as impulses, and then including the effects
of attenuation and the seismometer. (Okal, 1992.
© Seismological Society of America. All rights
reserved.)
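
The radiation patterns of Eqn 13 and the geometry factors of Eqn 14 are
straightforward to evaluate numerically. The sketch below is illustrative
only: the function names are hypothetical, all angles are taken in
degrees, and take-off angles are assumed to be measured from the downward
vertical (as suggested by the use of π − i for the upgoing pP ray).

```python
import numpy as np

def geometry_factors(strike, dip, slip, azimuth):
    """Fault geometry factors s_R, q_R, p_R, p_L, q_L of Eqn 14."""
    delta, lam = np.radians([dip, slip])
    az = np.radians(strike - azimuth)              # phi_f - phi
    s_R = np.sin(lam) * np.sin(delta) * np.cos(delta)
    q_R = (np.sin(lam) * np.cos(2 * delta) * np.sin(az)
           + np.cos(lam) * np.cos(delta) * np.cos(az))
    p_R = (np.cos(lam) * np.sin(delta) * np.sin(2 * az)
           - np.sin(lam) * np.sin(delta) * np.cos(delta) * np.cos(2 * az))
    p_L = (np.sin(lam) * np.sin(delta) * np.cos(delta) * np.sin(2 * az)
           + np.cos(lam) * np.sin(delta) * np.cos(2 * az))
    q_L = (-np.cos(lam) * np.cos(delta) * np.sin(az)
           + np.sin(lam) * np.cos(2 * delta) * np.cos(az))
    return s_R, q_R, p_R, p_L, q_L

def radiation_patterns(strike, dip, slip, azimuth, takeoff_p, takeoff_s):
    """P, SV, and SH radiation patterns of Eqn 13."""
    s_R, q_R, p_R, p_L, q_L = geometry_factors(strike, dip, slip, azimuth)
    i, j = np.radians([takeoff_p, takeoff_s])
    R_P = s_R * (3 * np.cos(i) ** 2 - 1) - q_R * np.sin(2 * i) - p_R * np.sin(i) ** 2
    R_SV = 1.5 * s_R * np.sin(2 * j) + q_R * np.cos(2 * j) + 0.5 * p_R * np.sin(2 * j)
    R_SH = -q_L * np.cos(j) - p_L * np.sin(j)
    return R_P, R_SV, R_SH

# A 45-degree-dipping thrust viewed along a vertically downgoing P ray.
print(radiation_patterns(0.0, 45.0, 90.0, 0.0, 0.0, 0.0)[0])
```

Two sanity checks follow from the focal mechanism plots: the steeply
downgoing P ray from a 45°-dipping thrust is compressional (positive),
the same ray from a normal fault is dilatational (negative), and a
vertical strike-slip fault radiates no P energy along its own strike.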

We could similarly model the SH wave, which arrives much
later, by summing direct S and sS using an expression analogous
to Eqn 12, with the S-wave velocity, take-off angles, delay
times, and the SH-wave radiation pattern $R^{SH}$.

This formulation shows how synthetic body wave seismograms
depend on the assumed focal depth, which determines
the time separation between arrivals, the mechanism, which
determines the relative amplitudes of the arrivals, and the
source time function, which determines the pulse shape.
Figure 4.3-7 illustrates this concept for P waves from two
dip-slip faults, one dipping vertically and the other at 45°. The
arrivals are shown first as impulses and then after convolution
with the seismometer and attenuation operators. In one case pP
leaves the focal sphere (shown in side view) with the same
polarity as P, whereas in the other it leaves with opposite
polarity. Its polarity then reverses at the free surface. Thus pP
on a seismogram need not have the opposite polarity from P.
Similar effects occur for sP. As a result, the relative polarities
and amplitudes of the arrivals vary with the mechanism,
making the seismogram a useful diagnostic.

Fig. 4.3-8 Body wave modeling procedure for depth determination.
Synthetic seismograms for an assumed fault geometry, including the
effects of the seismometer and attenuation, are calculated for various
depths. The data are best fit by a depth near 30 km. (Stein and Wiens,
1986. Rev. Geophys. Space Phys., 24, 806-32, copyright by the American
Geophysical Union.)

Source parameters can be studied by generating synthetic 
seismograms for various values and finding the best fit to the 
data, either by forward modeling (“trial and error”) or by 
inversion. Often first motion, body wave, and surface wave 
analyses (discussed next) are combined. Although first motion 
data are often consistent with various focal mechanisms, the 
different methods used together generally yield a consistent 
and better constrained result. 

Figure 4.3-8 shows an example for an earthquake near the
Sumatra trench, whose mechanism was reasonably well
constrained by first motions. To check the mechanism and estimate
the depth, synthetic seismograms were computed for various
focal depths. The left panel shows the expected timing and
amplitudes of various arriving phases, and the right shows the
synthetic seismogram resulting from including the effect of the
source (assuming a trapezoidal time function), seismometer,
and attenuation. The data are fit well by a source at a depth
near 30 km. Because the earthquake occurred beneath the
Indian Ocean, some rays reflected at the sea surface, and others
reflected at the sea floor. The sea floor reflection, pwP, should
have the same polarity (up) as pP, as observed. This method can
be extended to include the effects of crust and upper mantle
structure. As shown in Fig. 4.3-9, a crustal layer has less effect
than the water layer, because the water layer has a greater
contrast in velocity and density.

Fig. 4.3-9 Synthetic P-wave seismograms for an earthquake occurring
beneath the ocean, modeled both without and with a distinct crustal layer.
The crustal layer has a smaller effect than the water layer. (Stein and
Kroeger, 1980. Reproduced with the permission of the American Society
of Mechanical Engineers.)

Such depth determinations from body wave modeling are often
better than those provided by earthquake location programs
using arrival times. For example, the International
Seismological Center assigned the earthquake represented in
Fig. 4.3-8 a depth of 0 ± 17 km. Even if the depth is restricted
to be within the earth, the modeling shows that this solution is
too shallow.

How well the details of the time function can be resolved
depends on factors including the type of seismometer used and
the size of the earthquake. One important factor is the distance
between the source and the receiver, which influences the
amount of attenuation. As the pulse travels, the high
frequencies that determine the pulse shape are preferentially
removed by attenuation, because the amplitude (Eqn 5) decays by
1/e in a time $2Q/\omega$, so higher frequencies decay faster for
a given Q. Thus the seismogram is smoothed by the effects of both
attenuation and the seismometer (Fig. 4.3-9), especially for
long-period seismometers, which also suppress high frequencies
(e.g., Fig. 4.3-1). As a result, body wave pulses at teleseismic
distances can look similar for different source time functions
of approximately the same duration (Fig. 4.3-10). Conversely,
the best resolution for the details of source time functions is
given by strong motion records close to an earthquake and
broadband seismometers with uniform response over a wide
frequency range.

Fig. 4.3-10 Comparison of seismograms synthesized at a teleseismic
distance with different source time functions. The effects of the
seismometer and attenuation make it difficult to resolve some of the details
of the time function. (Stein and Kroeger, 1980. Reproduced with the
permission of the American Society of Mechanical Engineers.)

Larger earthquakes typically occur on longer faults, and
thus have longer-duration time functions. As a result, it is often
possible to resolve details of the slip process. For example,
Fig. 4.3-11 shows complex waveforms from the 1976 Guatemala
earthquake.3 The synthetic seismograms fit the data by assuming
that the source consisted of a number of separate sub-events
along the fault. Such studies can offer useful insight into the
faulting process by showing how the amount and geometry of
slip varied along the fault.

A useful way to estimate source time functions is based on
the Green’s function,

\[
g(t) = e(t) * q(t),
\tag{15}
\]

combining the elastic and anelastic effects of propagation from
the source to the receiver. The Green’s function thus describes
the signal that would arrive at the seismometer if the source
time function were a delta function. Hence the earthquake’s
source time function is found by deconvolving the Green’s
function and the seismometer from the seismogram u(t)

\[
x(t) = u(t) * [g(t) * i(t)]^{-1}, \qquad
X(\omega) = \frac{U(\omega)}{G(\omega)\, I(\omega)}.
\tag{16}
\]

As we discussed for reflection seismograms (Section 3.3.6),
deconvolution can be done in either the time or the frequency
domains. Dividing spectra in the frequency domain is easier,
but requires care to avoid dividing by small amplitudes which
can occur at some frequencies.

Large complex earthquakes can be modeled using Green’s
functions derived for a simple source in the fault region. The
seismogram is treated as the sum of source time functions with
different amplitudes, $C_j$, at different times, $\tau_j$,

\[
u(t) = \sum_{j=1}^{N} C_j \left[ x(t - \tau_j) * g(t) * i(t) \right].
\tag{17}
\]

Fig. 4.3-11 Data and synthetic seismogram for the large ($M_s$ 7.5) 1976
Guatemala earthquake. The source is modeled as a series of sub-events
along the fault, with positions, timing, relative amplitudes, and
mechanisms shown, which gives rise to the complex waveform observed.
(After Kikuchi and Kanamori, 1991. © Seismological Society of America.
All rights reserved.)

3 This $M_s$ 7.5 earthquake, on the Motagua fault which is a transform segment of the
boundary between the Caribbean and North American plates (Fig. 5.2-4), caused
enormous damage and 22,000 deaths.


With high-quality data, we will see in Section 4.5.3 that it 
is possible to go the next step and estimate how the seismic 
moment release varied on the two-dimensional fault surface as 
a function of time during the rupture. 
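
The frequency-domain deconvolution of Eqn 16, together with the caution
about dividing by small spectral amplitudes, can be sketched as follows.
The "water level" floor used here is one common stabilization and is an
assumption of this example rather than a method prescribed in the text;
the decaying-exponential kernel standing in for g(t) * i(t), the boxcar
source, and the function name are likewise illustrative.

```python
import numpy as np

def deconvolve_water_level(u, kernel, water_level=1e-6):
    """Estimate x(t) from u = x * kernel by spectral division (Eqn 16).

    Kernel spectral power below water_level times its peak is raised to that
    floor, so that near-zero amplitudes do not blow up the quotient.
    """
    n = len(u)
    U = np.fft.rfft(u, n)
    K = np.fft.rfft(kernel, n)
    power = np.abs(K) ** 2
    floor = water_level * power.max()
    X = U * np.conj(K) / np.maximum(power, floor)
    return np.fft.irfft(X, n)

dt = 0.1
t = np.arange(256) * dt
x_true = np.where((t >= 1.0) & (t < 3.0), 0.5, 0.0)   # 2 s boxcar source
kernel = np.exp(-t / 0.5) * dt                         # stand-in for g(t) * i(t)

# Forward problem via the FFT. Both series die out well before the end of
# the window, so wrap-around from the circular convolution is negligible.
u = np.fft.irfft(np.fft.rfft(x_true) * np.fft.rfft(kernel), len(t))

x_est = deconvolve_water_level(u, kernel)
print(np.max(np.abs(x_est - x_true)))                  # rounding-error level
```

With noise-free synthetic data and a kernel whose spectrum has no deep
holes, the source is recovered essentially exactly. With real data the
water level would need to be raised, trading resolution of the time
function for stability.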

4.3.4 Surface wave focal mechanisms 

Surface waves can be modeled in a conceptually similar way to 
body waves, and also help resolve earthquake focal mecha¬ 
nisms and depths. In contrast to body wave modeling, which 
we considered in the time domain using ray theory, we pose 
surface wave modeling in the frequency domain using a formu¬ 
lation derived from the traveling wave approximation to the 
earth’s normal modes (Section 2.9.6). Thus, for surface waves 
the contributing factors appear as products of their Fourier 
transforms (Eqn 4), whereas for body waves (Eqn 12) they 
appear as convolutions in the time domain (Eqn 3). 
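
The equivalence between the two descriptions can be checked numerically. This is a minimal sketch with made-up sequences standing in for a source term and a propagation term: convolving in the time domain gives the same result as multiplying the Fourier transforms and transforming back. The function name is ours.

```python
import numpy as np

def convolve_via_spectra(a, b):
    """Convolve two sequences by multiplying their Fourier transforms."""
    n = len(a) + len(b) - 1          # zero-pad to avoid wrap-around
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

source = np.array([0.0, 1.0, 2.0, 1.0, 0.0])   # hypothetical source term
path = np.array([1.0, 0.5, 0.25])              # hypothetical propagation term
time_domain = np.convolve(source, path)
freq_domain = convolve_via_spectra(source, path)
```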

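We model the transverse component of a Love wave seismogram
observed at angular distance $\theta$ and azimuth $\phi$ from an
earthquake by its Fourier transform

$$U(\omega, \theta, \phi) = \frac{M(\omega)}{\sqrt{\sin\theta}}\, e^{-i\pi/4}\, e^{-i\omega a\theta/c}\, V(\omega, \phi)\, e^{-\omega a\theta/2QU}\, e^{im\pi/2},$$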

$$V(\omega, \phi) = p_L P_L(\omega) + i q_L Q_L(\omega). \tag{18}$$

Here $a$ is the earth's radius, and $c$ and $U$ are the phase and group
velocities (Section 2.8.1) at this frequency. The $(m\pi/2)$ term,
where $m$ is the number of times the wave passed the epicenter
or its antipode, is called the polar phase shift.⁴ $M(\omega)$ represents
the earthquake's seismic moment release as a function of
frequency, and thus can incorporate effects of the source time
function. Fault finiteness is included, using a frequency domain
formulation akin to that for body waves (Eqn 8). Except for
large earthquakes, $M(\omega)$ can typically be regarded as a constant
equal to the scalar moment.

Several terms model the effects of propagation away from
the source. The decaying exponential $e^{-\omega a\theta/2QU}$ is a
formulation of the attenuation for surface waves, derived from Eqn 5
with $a\theta/U$ giving the travel time and $Q$ being the quality factor
at this frequency. The phase as a function of position is given
by the complex exponential $e^{-i\omega a\theta/c}$. The $1/\sqrt{\sin\theta}$ term
describes the amplitude decay due to geometric spreading as
the wavefront moves away from the source. Thus $\theta$ is the actual
distance the wave traveled, including any $2\pi$ terms.

The term $V(\omega, \phi)$, which describes the radiation pattern as a
function of frequency and the azimuth $\phi$, contains two sets of
factors. The excitation functions $P_L(\omega)$ and $Q_L(\omega)$, which are
derived from the radial eigenfunctions for torsional modes
with the appropriate frequency, are functions of frequency and
the elastic constants at the source depth. These functions
weight the SH-wave fault geometry factors $p_L$ and $q_L$ (Eqn 14).
Because the radiation pattern is a complex number, we can
write both amplitude and phase radiation patterns for a given
frequency as a function of azimuth


$$|V(\omega, \phi)| = \left[ (p_L P_L(\omega))^2 + (q_L Q_L(\omega))^2 \right]^{1/2},$$

$$\Phi(\omega, \phi) = \tan^{-1} \left[ (q_L Q_L(\omega)) / (p_L P_L(\omega)) \right]. \tag{19}$$

Similarly, we can synthesize the vertical component of
Rayleigh waves using

$$U(\omega, \theta, \phi) = \frac{M(\omega)}{\sqrt{\sin\theta}}\, e^{i\pi/4}\, e^{-i\omega a\theta/c}\, V(\omega, \phi)\, e^{-\omega a\theta/2QU}\, e^{im\pi/2},$$

$$V(\omega, \phi) = s_R S_R(\omega) + p_R P_R(\omega) + i q_R Q_R(\omega). \tag{20}$$

The radiation pattern $V(\omega, \phi)$ contains excitation functions
$S_R(\omega)$, $P_R(\omega)$, and $Q_R(\omega)$, derived from the radial
eigenfunctions of spheroidal modes, together with the P-SV fault
geometry factors $s_R$, $q_R$, and $p_R$ (Eqn 14).

Theoretical surface wave spectra can be computed for any
fault geometry using the radiation pattern. For example, a
vertically dipping dip-slip fault has $s_R = p_R = 0$, $q_R = -\sin(\phi_f - \phi)$,
so the only excitation function on which the radiation pattern
depends is $Q_R$. Alternatively, for a vertically dipping strike-slip
fault, $s_R = q_R = 0$, $p_R = \sin 2(\phi_f - \phi)$, so the radiation pattern

4 This shift arises from the $(l + 1/2)\theta$ in the approximation used to convert normal
modes to traveling waves (see Aki and Richards, 1980). For its application to
equalization, see Kanamori (1970a).
Fig. 4.3-12 Focal mechanisms and surface wave amplitude radiation
patterns for six fault geometries. The mechanisms all have one fault plane 
with a strike of 0°, and the radiation patterns are for a source of constant 


depends on $P_R$. Thus Rayleigh wave spectral amplitudes for
vertically dipping dip-slip and strike-slip faults vary with
azimuth as $\sin(\phi_f - \phi)$ and $\sin 2(\phi_f - \phi)$.
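
These two azimuthal dependences can be evaluated directly from the radiation pattern of Eqn 20. The following is a minimal sketch, not a full synthesis: the excitation functions are set to unity as placeholders (real values depend on frequency, source depth, and earth structure), and the function names are ours.

```python
import numpy as np

def rayleigh_amplitude(s_r, p_r, q_r, S_R=1.0, P_R=1.0, Q_R=1.0):
    """|V| from Eqn 20; the excitation functions default to placeholders."""
    V = s_r * S_R + p_r * P_R + 1j * q_r * Q_R
    return np.abs(V)

def count_lobes(amplitude):
    """Count local maxima of an amplitude curve sampled around a full circle."""
    prev = np.roll(amplitude, 1)
    nxt = np.roll(amplitude, -1)
    return int(np.sum((amplitude > prev) & (amplitude > nxt)))

azimuth = np.radians(np.arange(0.0, 360.0, 1.0))
strike = 0.0                                   # fault strike phi_f (north)
zeros = np.zeros_like(azimuth)

# Vertical dip-slip: s_R = p_R = 0, q_R = -sin(phi_f - phi)
dip_slip = rayleigh_amplitude(zeros, zeros, -np.sin(strike - azimuth))
# Vertical strike-slip: s_R = q_R = 0, p_R = sin 2(phi_f - phi)
strike_slip = rayleigh_amplitude(zeros, np.sin(2.0 * (strike - azimuth)), zeros)
```

Counting the maxima with `count_lobes` gives two lobes for the dip-slip case and four for the strike-slip case.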

Figure 4.3-12 shows theoretical amplitude radiation patterns 
for Love and Rayleigh waves corresponding to several focal 
mechanisms, all with a fault plane striking north (0°). The 
patterns are distinctive: a vertical strike-slip fault has two
four-lobed patterns, whereas a 45°-dipping dip-slip fault has a
four-lobed Love wave pattern and a two-lobed Rayleigh wave
pattern. These radiation patterns are computed for the same 
seismic moment, and thus show that a vertical strike-slip earth¬ 
quake is much more efficient at generating Love waves than a 
vertical dip-slip one. A 45°-dipping oblique-slip mechanism is 
Fig. 4.3-13 Determination of a focal
mechanism using surface wave amplitudes. 
Although P-wave first motions cannot 
constrain both nodal planes, the second 
plane is constrained by matching the 
observed Love and Rayleigh wave radiation 
amplitude patterns. (Stein, 1978.) 



Love Rayleigh 


intermediate between the 45°-dipping strike-slip and the 45°- 
dipping thrust mechanisms, and so are the corresponding Love 
and Rayleigh radiation patterns. Such patterns can be gener¬ 
ated for any fault geometry and compared to observations to 
find the best-fitting source geometry.

To do so, seismograms are Fourier-analyzed to determine 
the spectral amplitudes at certain frequencies. We can then 
either model the amplitude at each station, or generate the 
observed radiation pattern by an equalization correction which 
simulates a common source-station distance. To do the latter, 
observations at distance $\theta$, with Fourier transform $U(\omega, \theta, \phi)$,
are equalized to a distance $\theta_0$ using




The $(m\pi/2)$ term, where $m$ is the number of times the path
connecting $\theta$ and $\theta_0$ goes through the epicenter or its antipode, is
the polar phase shift.
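
For spectral amplitudes only, an equalization correction can be sketched by taking the ratio of the amplitude terms in Eqn 18 at the two distances, which leaves a geometric spreading factor and an attenuation factor. This is our reading of those terms rather than a transcription of the equalization formula, and it ignores the phase terms. The period, Q, group velocity, and earth radius below are illustrative values, and the function names are ours.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed value for the radius a

def attenuated_amplitude(source_amplitude, theta, omega, q, group_velocity,
                         a=EARTH_RADIUS_KM):
    """Amplitude part of Eqn 18: geometric spreading times attenuation."""
    spreading = 1.0 / np.sqrt(np.abs(np.sin(theta)))
    attenuation = np.exp(-omega * a * theta / (2.0 * q * group_velocity))
    return source_amplitude * spreading * attenuation

def equalize_amplitude(amplitude, theta, theta_0, omega, q, group_velocity,
                       a=EARTH_RADIUS_KM):
    """Scale an amplitude observed at theta to the common distance theta_0."""
    spreading = np.sqrt(np.abs(np.sin(theta)) / np.abs(np.sin(theta_0)))
    attenuation = np.exp(omega * a * (theta - theta_0)
                         / (2.0 * q * group_velocity))
    return amplitude * spreading * attenuation

omega = 2.0 * np.pi / 100.0        # 100 s period (illustrative)
q, u_group = 150.0, 3.8            # illustrative Q and group velocity (km/s)
observed = attenuated_amplitude(1.0, np.radians(60.0), omega, q, u_group)
equalized = equalize_amplitude(observed, np.radians(60.0), np.radians(90.0),
                               omega, q, u_group)
reference = attenuated_amplitude(1.0, np.radians(90.0), omega, q, u_group)
```

Because a single Q and group velocity are used for every path, this sketch shares the limitation noted for the equalized data: real paths differ, so the equalized amplitudes scatter about the theoretical pattern.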

Equalization ideally removes all propagation effects, so the 


spectral amplitude as a function of azimuth should reflect the 
source’s radiation pattern and be comparable to theoretical 
patterns. Figure 4.3-13 shows an example for a normal faulting 
earthquake in the diffuse plate boundary zone of the Indian 
Ocean (Fig. 5.5-5), using Rayleigh and Love waves with the 
source-receiver paths indicated. Because the first motion 
data constrained only one E-W striking, north-dipping, nodal 
plane, the second plane was derived by matching theoretical 
surface wave amplitude radiation patterns (smooth lines) to the 
equalized data. Although the observed radiation patterns are 
somewhat jagged, the fault geometry shown is consistent with 
the first motions and matches the maximum and minimum 
amplitude directions of the surface waves. 

The equalized data in Fig. 4.3-13 are not as smooth as the 
theoretical pattern, both because of noise in the data and be¬ 
cause the equalization assumes that the attenuation and group 
velocity are the same for all paths, whereas in reality they vary. 
As a result, the amplitudes at some stations are higher or lower 
than predicted. This effect can be reduced by correcting
for velocity and Q structure. Even without such corrections,
these analyses are often valuable for mechanism studies,
including for moderate-sized earthquakes like the one in this
example. Phase radiation patterns can also be used, but are
generally more sensitive to lateral variations in velocity.
Fig. 4.3-14 Surface wave depth determination uses the variation in Rayleigh wave excitation functions with period and source depth (top) (Romanowicz
and Guillemant, 1984. © Seismological Society of America. All rights reserved.) For example (bottom), the Rayleigh wave spectrum shown is best fit by a
4-5 km focal depth. (Tsai and Aki, 1970. J. Geophys. Res., 75, 5729-43, copyright by the American Geophysical Union.)

Surface waves can also provide information about earthquake
depths because the excitation functions depend on period
and source depth, as shown in Fig. 4.3-14 (top) for Rayleigh
waves. The excitation decreases with source depth, as expected
for fundamental mode Rayleigh waves. For a shallow source
$Q_R(\omega)$ goes to zero, because this term is proportional to the
shear stress generated by the wave, which is zero at the free
surface. Figure 4.3-14 (bottom) compares an observed surface
wave amplitude spectrum to that predicted for various source
depths, with the best fit for 4-5 km depth. This process can be
formalized by computing the error as a function of assumed
source depth and seeking the depth that provides the best fit.

Surface waves can also be used to study fault length and
rupture for large earthquakes. Figure 4.3-15 shows an analysis for
the great 1964 Alaska earthquake, the second largest ever
instrumentally recorded (Fig. 1.2-2). The focal mechanism
and geodetic data imply thrust faulting on a roughly NE-SW-
striking, shallow NW-dipping fault, due to the subduction of
the Pacific plate beneath North America (Fig. 5.2-3). The
earthquake was so large that surface waves were unusable until their
amplitude had decayed enough, by the fifth station passage (R5
and G5, Fig. 2.7-3). From Fig. 4.3-12, we would expect both
the Love and Rayleigh wave amplitude radiation patterns to
have minima in the strike direction. However, the observed
Fig. 4.3-15 Focal mechanism for the great
1964 Alaska earthquake, and the surface
wave radiation patterns it predicts if the
source is treated as a point (top left). Love
and Rayleigh waves are shown as solid and
dashed lines, respectively. The observed
patterns (jagged lines) are quite different,
but are reasonably consistent with those
predicted by a finite source propagating
southwestward along the 600 km-long
fault plane, consistent with the large
aftershock area (bottom). (Kanamori, 1970b.
J. Geophys. Res., 75, 5029-40, copyright by
the American Geophysical Union.)



amplitude radiation patterns are quite different, and modeling 
shows them to be consistent with rupture propagating
southwestward along the 600 km-long fault plane. This dimension is
consistent with the large aftershock area, and together with the 
seismic moment (Section 4.6) implies an average fault slip of 
about 7 meters, bearing out the gigantic nature of the earth¬ 
quake. 5 In fact, postseismic deformation is still observed with 
geodetic data (Fig. 4.5-15). 

4.3.5 Once and future earthquakes

Combining body and surface wave modeling with first motions 
is often valuable for studying seismograms from older earth¬ 
quakes. This application arises often in tectonic studies, be¬ 
cause in many cases the largest earthquakes occurred prior to 
the development of global seismic networks in the early 1960s 
(Section 6.6). Since about 1930, a few stations have reported 
first motions to the International Seismological Summary. The 
number of points per earthquake is far less than that avail¬ 
able for a modern study, and the data from nonstandardized 
seismometers are often discordant. However, in some cases 
body and/or surface wave modeling is useful, especially if the 
first motions constrain at least one nodal plane. One technique 
is to use the ratio of Love and Rayleigh wave amplitudes. 

5 Some of the earthquake damage is shown in Fig. 1.2-11.


This discussion brings out an important difference between 
first motion and modeling studies. For first motion studies, 
all we need to know about the seismometer is the polarity, so 
compressional arrivals are in fact “up” on the seismograms. 
However, modeling requires knowing the response of the 
instruments. Fortunately, modern instruments are (at least in 
theory) standardized, and their calibration can be checked. 
This is a problem for studies of older earthquakes, because 
calibrations were often quite poor. 

In recent years, modeling approaches have become steadily 
more powerful. High-quality data from digital broadband 
seismometers (Section 6.6) have become standard. In addition, 
laterally homogeneous models for seismic velocity and attenu¬ 
ation have been developed and improved. As a result, inver¬ 
sions of body and surface wave data for many earthquakes, as 
discussed in the next section, are giving large focal mechanism 
datasets for tectonic and earthquake source studies. 

4.4 Moment tensors 

4.4.1 Equivalent forces 

Our approach so far in this chapter has been to view earthquakes
as due to slip on a fault and to estimate their source
parameters by forward modeling the radiated seismic waves.
We now generalize this approach to include other types of
seismic sources. This formulation, using the seismic moment tensor,
gives additional insight into the rupture process and greatly
simplifies inverting seismograms to estimate source parameters.

We begin by returning to the concept of finding the seismic
waves generated by earthquakes due to slip on a fault by solving
the equation of motion with the faulting represented by
equivalent body forces (Section 4.2.3) that yield the same
seismic radiation. Although these forces are a seismic source
equivalent to the fault motion, they do not describe the actual
fracture process. Equivalent body forces can also be derived
for other seismic sources, such as explosions, landslides, or
impacts on the earth’s surface. These phenomena can generate
observable seismic waves when they occur rapidly enough
(over times less than about an hour) that they release energy
into the earth in the seismic wave frequency band (Fig. 2.4-7).
If the energy is released more slowly, propagating seismic
waves are not excited, although slower crustal deformation can
be recorded using geodetic methods (Section 4.5.1).

Figure 4.4-1 illustrates the forces we consider. As noted earlier,
earthquakes involving slip upon a fault are modeled as a
double couple composed of four forces. However, this
combination is just one possible combination of forces. Thus we first
consider single and double forces, and then work up to double
couples.

Fig. 4.4-1 Equivalent body force descriptions of a single force, a single
couple, and a double couple. The force couple can take two forms. One,
shown for $M_{xy}$, has two forces $f$ offset by distance $d$ such that a torque is
exerted. The other, shown for $M_{xx}$, is a force dipole which exerts no
torque. Slip on a fault can be described by the superposition of either
couples like $M_{xy}$ and $M_{yx}$ or dipoles like $M_{x'x'}$ and $-M_{y'y'}$.


4.4.2 Single forces

Outside of exploration applications, most seismograms result 
from earthquakes. However, other geophysical phenomena 
generate seismic waves that are sometimes modeled as single¬ 
force sources. A striking example is the large seismic waves 
generated by the 1980 explosive eruption of Mt St Helens, one 
of the Cascade volcanoes reflecting the subduction of the Juan 
de Fuca plate beneath North America (Fig. 5.2-3). The Love
and Rayleigh wave radiation patterns (Fig. 4.4-2) are two- 
lobed, of comparable amplitude, and rotated 90° from each 
other. Consideration of the patterns for double-couple fault 
sources shows that such a lobe pattern is expected only for a 
vertical dip-slip fault (Fig. 4.3-12), and that in this case the 
Love waves should be much smaller than the Rayleigh waves. 
Fig. 4.4-2 Top: Observed surface wave
amplitude radiation patterns from the May
18, 1980, blast at Mt St Helens. Bottom:
Theoretical radiation patterns for several
seismic sources. Only the horizontal force
yields two-lobed Love and Rayleigh wave
patterns of comparable amplitude, rotated
90° from each other. (Kanamori and Given,
1982. J. Geophys. Res., 87, 5422-3,
copyright by the American Geophysical
Union.)
Fig. 4.4-3 Modeling of the November 18,
1929, earthquake and landslide off the
Grand Banks. The slump ruptured trans-
Atlantic cables (solid lines, right) at several
places (crosses). In this study, the S waves
are modeled with a single force with the
source time function shown (left)
representing the slump. Other studies treat
the slump as resulting from an earthquake.
(Hasegawa and Kanamori, 1987.
© Seismological Society of America.
All rights reserved.)



Of the likely non-double-couple sources, both a vertical force 
and an explosion would produce no Love waves and a circular 
(rather than lobed) Rayleigh wave radiation pattern. How¬ 
ever, a horizontal force can reproduce the observed radiation 
patterns. The seismic source has thus been modeled with a 
southward-pointing single force, opposite the direction of the 
north-directed explosion and northward-flowing landslide. 
The modeling gives estimates of the force involved in the land¬ 
slide and explosion, which devastated more than 250 square 
miles (640 km²) on the north side of the mountain. This explosion
is equivalent to an $M_s$ 5.2 earthquake, significantly bigger
than the smaller earthquakes often associated with magma 
movements within volcanoes. 

Landslides have also been modeled by a single force in the 
direction opposite that of the rock flow. Figure 4.4-3 illustrates 
this for a large underwater slump (a kind of landslide in which 
the mass of rock moves as a coherent body) associated with the 
1929 $M_s$ 7.2 Grand Banks earthquake. This earthquake, one
of the largest in a minor zone of seismicity along the Atlantic 
continental margin of Canada (Section 5.6.3), was notable 
because the slump generated powerful sediment flows, known 
as turbidity currents, which ruptured telephone cables and 
hence provided important evidence on the speed and force of 
such currents. As shown, the observed S waves are reasonably 
well modeled by synthetic seismograms for a horizontally ori¬ 
ented single force, implying that the slump itself was the seismic 
source. However, another study found that the seismograms 
were well modeled by a double-couple earthquake at about 
20 km depth, which triggered the slump. The issue of whether 
it takes an earthquake to generate such slumps is interesting 
because such mass movements, which might occur on many 
heavily sedimented continental margins, can also generate sig¬ 
nificant tsunamis (Section 1.2.4). The tsunami for this earth¬ 
quake caused 27 deaths along the Canadian coast, and a slump 
following an $M_s$ 7.0 earthquake is thought to have caused the
devastating 1998 New Guinea tsunami which caused over 
2000 deaths. 


Meteor impacts should, in principle, generate significant 
seismic waves. Impacts have been detected seismologically on 
the moon, but not on earth, where only large meteorites survive 
passage through the atmosphere. Although it might seem that a 
meteor impact should be modeled as a vertical force, this would 
probably not be correct, because the impact’s energy would 
vaporize rock and cause a spherically symmetric explosion 
similar to an underground nuclear detonation. This idea is sup¬ 
ported by the observation that craters produced by meteorites, 
which are believed to have impacted at very oblique angles, are 
essentially symmetrical. As we will see, spherically symmetric 
explosions can be modeled by a set of three orthogonal force 
couples. 

4.4.3 Force couples 

A force couple consists of two forces acting together. These are 
similar in concept to electromagnetic dipoles, like that used to 
model the earth’s magnetic field. Two basic couples are shown 
in Fig. 4.4-1. One consists of a pair of forces offset in a direc¬ 
tion normal to the force. The couple M xy consists of two forces 
of magnitude $f$, separated by a distance $d$ along the $y$ axis, that
act in opposite ($\pm x$) directions. The magnitude of $M_{xy}$ is $fd$,
which in seismology is given in dyn-cm or N-m. To model a 
couple acting at a point, the limit is taken as d goes to zero such 
that the product fd stays constant. 

The other type of couple, a vector dipole , consists of forces 
offset in the direction of the force. M xx consists of two forces of 
magnitude f acting in the ±x directions, separated by d along 
the x axis. The magnitude is fd , and the limit is taken in the 
same way. The difference between the two couple types is that 
the second exerts no torque. 

Combining force couples of different orientations into the 
seismic moment tensor M (Fig. 4.4-4) gives a general description 
that can represent various seismic sources. No geophysical pro¬ 
cesses have been found that are best modeled as single couples, 
probably because such couples would generate large torques 
Fig. 4.4-4 The nine force couples which are the components of the seismic
moment tensor. Each consists of two opposite forces separated by a 
distance d (dashed line), so the net force is always zero. 
and thus observable rotations of the earth about different axes.
The double and triple sets of couples used to model earthquakes 
and explosions, respectively, do not generate net torques. 1 

4.4.4 Double couples 

Figure 4.4-1 illustrates the relation between an earthquake’s 
fault geometry and the double couple of equivalent body 
forces. For this example, left-lateral strike-slip in the ±y direc¬ 
tions on a fault in the y-z plane, the equivalent body forces 
$M_{xy} + M_{yx}$ make up the double-couple source. The $M_{yx}$ couple
seems intuitive, because the forces point in the slip directions, 
but the M xy couple is also needed for reasons including avoid¬ 
ing net torque on the fault. 

Because the equivalent body forces are a double couple, they 
would be the same if the slip were instead right-lateral on a 
fault in the x-z plane. Thus, as we have noted, seismic waves 
from a point double-couple source are the same regardless of 
which plane is the fault plane and which is the perpendicular, 
auxiliary plane. 

The magnitude of the equivalent body forces is M 0 , the scalar 
seismic moment of the earthquake, which has units of dyn-cm, 
like those of a force couple. Thus if M xy and M yx are couples of 
unit magnitude, the moment tensor is 

$$\mathbf{M} = M_0 (M_{xy} + M_{yx}). \tag{1}$$

1 Earthquakes can cause measurable changes in the earth’s rotation. However, these 
result not from applied torques, but from vertical redistribution of mass due to static 
displacements near a fault (Section 4.5). 


Fig. 4.4-5 Schematic approximations made in modeling the seismic
rupture process. Top: The rupture process involves a complicated slip 
function that is variable in space and time. The scalar seismic moment is 
the integral of this slip process. Middle: To infer source parameters, we 
approximate the rupture as a constant slip D on a geometrically simple 
fault, making the moment a product of the rigidity, average slip, and fault 
area. Bottom: The faulting is further approximated as a double couple of 
equivalent body forces with moment fd. 

Hence the moment tensor of an earthquake represents both 
its fault geometry, via the different components, and its size, via 
the scalar moment. The moment tensor is a simple mathemat¬ 
ical representation that gives the seismic waves produced by a 
complex rupture involving displacements varying in space and 
time on an irregular fault (Fig. 4.4-5). In the previous section we
approximated the rupture with a constant average displace¬ 
ment D over a rectangular fault, and we now approximate it 
further as a set of force couples. These successive approxima¬ 
tions are usually surprisingly successful at matching observed 
seismograms. 

4.4.S Earthquake moment tensors 

As we have seen, the equivalent body forces for seismic sources 
of different geometries are represented by the seismic moment 
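tensor, $\mathbf{M}$, whose components are the nine force couples

$$\mathbf{M} = \begin{pmatrix} M_{xx} & M_{xy} & M_{xz} \\ M_{yx} & M_{yy} & M_{yz} \\ M_{zx} & M_{zy} & M_{zz} \end{pmatrix}. \tag{2}$$

In this notation, the earthquake in Fig. 4.4-1 is represented as

$$\mathbf{M} = \begin{pmatrix} 0 & M_0 & 0 \\ M_0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = M_0 \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{3}$$

We can write the moment tensor in any orthogonal coordinate
system because vector and tensor equations are valid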
regardless of coordinate system. In general, the tensor appears 
more complicated than Eqn 3 if the fault and slip directions 
are not oriented neatly relative to the coordinate system. To see 
this, we write the moment tensor for a double-couple earth¬ 
quake in an arbitrary coordinate system. The components are 
given by the scalar moment and the components of n, the unit 
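normal vector to the fault plane, and d, the unit slip vector,

$$M_{ij} = M_0 (n_i d_j + n_j d_i), \tag{4}$$

or

$$\mathbf{M} = M_0 \begin{pmatrix} 2 n_x d_x & n_x d_y + n_y d_x & n_x d_z + n_z d_x \\ n_y d_x + n_x d_y & 2 n_y d_y & n_y d_z + n_z d_y \\ n_z d_x + n_x d_z & n_z d_y + n_y d_z & 2 n_z d_z \end{pmatrix}. \tag{5}$$
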

This formulation shows two important things. First, the
interchangeability of n and d makes the tensor symmetric
($M_{ij} = M_{ji}$). Physically, this shows that slip on either the fault
plane or the auxiliary plane yields the same seismic radiation
patterns. Second, the trace (sum of diagonal components) of
the tensor is zero,²

$$\sum_i M_{ii} = M_{ii} = 2 M_0 n_i d_i = 2 M_0\, \hat{n} \cdot \hat{d} = 0, \tag{6}$$


because the slip vector lies in the fault plane and is thus perpen¬ 
dicular to the normal vector. Flence moment tensors corre¬ 
sponding to slip on a fault plane have zero trace. A nonzero 
trace implies a volume change (explosion or implosion). Such 
an isotropic component does not exist for a pure double-couple 
source. 
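
The symmetry and zero trace can be verified numerically. This minimal sketch builds the tensor of Eqn 4 directly from a fault normal and slip vector; the function names and the oblique test vectors are ours, and the first example is the geometry of Fig. 4.4-1 (fault in the y-z plane, slip along y).

```python
import numpy as np

def double_couple_moment_tensor(m0, normal, slip):
    """M_ij = M0 (n_i d_j + n_j d_i) for unit vectors n and d (Eqn 4)."""
    n = np.asarray(normal, dtype=float)
    d = np.asarray(slip, dtype=float)
    n = n / np.linalg.norm(n)
    d = d / np.linalg.norm(d)
    if abs(np.dot(n, d)) > 1e-9:
        raise ValueError("slip vector must lie in the fault plane")
    return m0 * (np.outer(n, d) + np.outer(d, n))

def scalar_moment(M):
    """Magnitude of the moment tensor: sqrt(sum of M_ij^2) / sqrt(2)."""
    return np.sqrt(np.sum(M ** 2) / 2.0)

# Fault in the y-z plane (normal along x) with slip along y.
M_simple = double_couple_moment_tensor(1.0, [1, 0, 0], [0, 1, 0])

# An arbitrarily oriented fault gives a fuller tensor with the same properties.
M_oblique = double_couple_moment_tensor(3.0, [1, 1, 0], [1, -1, 2])
```

In both cases the tensor is symmetric, its trace vanishes, and `scalar_moment` returns the value of M0 that was supplied.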

Before going further, it is worth briefly considering the 
tensor properties of $M_{ij}$. In discussing stress, we noted that a
matrix of numbers is a tensor only if it transforms between 
coordinate systems in a specific way (Eqn 2.13,18). It is easy to 
prove that the moment tensor for a double couple (Eqn 5) 
transforms in this manner, because it is a physical entity relat¬ 
ing the normal and slip vectors much as the stress tensor relates 
the normal and traction vectors. At a deeper level, $M_{ij}$ is a tensor
even for non-double-couple sources because it derives, in a 
complicated way that we will not discuss, from the change the 
earthquake causes in stress integrated over the source region. 
The scalar moment gives the magnitude of the moment tensor,
$M_0 = \left( \sum_{ij} M_{ij}^2 \right)^{1/2} / \sqrt{2}$, which is analogous to the
magnitude of a vector.

2 Recall the summation convention notation (Section A.3.5) that a repeated index
indicates summation.

Using the definitions of the normal and slip vectors in terms 
of fault strike, dip, and slip directions (Section 4.2), we can 
write the moment tensor for any fault. The reverse process of 
finding the fault geometry corresponding to a moment tensor is 
more complicated. However, we need this ability for seismo¬ 
gram inversions that yield the moment tensor. This can be done 
using some ideas from linear algebra about vector transforma¬ 
tions (Section A.5), because the eigenvectors of the moment 
tensor are parallel to the T, P, and null axes. 

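To show this, we use the fact (Section 4.2.5) that vectors in
these three orthogonal directions t, p, and b can be written in
terms of the fault normal, $\hat{n}$, and slip vector, $\hat{d}$, as

$$\mathbf{t} = \hat{n} + \hat{d}, \qquad t_i = n_i + d_i,$$
$$\mathbf{p} = \hat{n} - \hat{d}, \qquad p_i = n_i - d_i,$$
$$\mathbf{b} = \hat{n} \times \hat{d}, \qquad b_i = \epsilon_{ijk} n_j d_k. \tag{7}$$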

To prove that these are the eigenvectors and to find the 
eigenvalues, we begin with t, a vector in the T axis direction, 
and evaluate 

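$$\begin{aligned} M_{ij} t_j &= M_0 (n_i d_j + n_j d_i)(n_j + d_j) \\ &= M_0 (n_i n_j d_j + n_i d_j d_j + n_j n_j d_i + n_j d_i d_j). \end{aligned} \tag{8}$$

Because the normal and slip vectors are perpendicular
($n_i d_i = 0$) and have unit length ($n_i n_i = d_i d_i = 1$), we see that

$$M_{ij} t_j = M_0 (d_i + n_i) = M_0 t_i. \tag{9}$$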

Thus the scalar moment M 0 is the eigenvalue associated with t, 
which is an eigenvector. 

Similarly, for the P axis, 

$$\begin{aligned} M_{ij} p_j &= M_0 (n_i d_j + n_j d_i)(n_j - d_j) \\ &= M_0 (n_i n_j d_j + n_j n_j d_i - n_i d_j d_j - n_j d_i d_j) \\ &= M_0 (d_i - n_i) = -M_0 p_i, \end{aligned} \tag{10}$$

so —M 0 is the eigenvalue associated with p, which is also an 
eigenvector. 

Finally, because $M_{ij}$ is a real symmetric matrix, we know that
a third eigenvector is perpendicular to the first two (Section
A.5.3). This turns out to be the null axis, b. In Section 4.2.5 we
showed that the null axis is perpendicular to the P and T axes:

$$(1/2)(\mathbf{t} \times \mathbf{p}) = -(\hat{n} \times \hat{d}) = -\mathbf{b}. \tag{11}$$

To show that b is an eigenvector, we form 

$$\begin{aligned} M_{il} b_l &= M_0 (n_i d_l + d_i n_l)\, \epsilon_{ljk} n_j d_k \\ &= M_0\, \epsilon_{ljk} (n_i d_l n_j d_k + d_i n_l n_j d_k) \\ &= M_0 [n_i n_j (\epsilon_{ljk} d_l d_k) + d_i d_k (\epsilon_{ljk} n_l n_j)], \end{aligned} \tag{12}$$

and recognize that the cross-product of a vector with itself is
zero,

$$\epsilon_{ljk} d_l d_k = \epsilon_{jkl} d_k d_l = \hat{d} \times \hat{d} = 0,$$
$$\epsilon_{ljk} n_l n_j = \epsilon_{klj} n_l n_j = \hat{n} \times \hat{n} = 0, \tag{13}$$

so the null axis b is an eigenvector with associated eigenvalue 0:

$$M_{il} b_l = 0. \tag{14}$$

The fact that the P, T, and null axes are the eigenvectors 
of the moment tensor lets us simplify it by transforming it 
into the “natural” coordinate system whose basis vectors are 
the eigenvectors. Such orthogonal transformations transform a 
tensor from one orthogonal coordinate system to another, such 
that its components change, but its physical meaning does not. 
The transformation matrix with the eigenvectors as columns 
(Section A.5.3), 


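$$U = \begin{pmatrix} t_1 & b_1 & p_1 \\ t_2 & b_2 & p_2 \\ t_3 & b_3 & p_3 \end{pmatrix}, \tag{15}$$

gives a diagonal moment tensor for a double couple in the
principal axis coordinate system

$$U^{T} \mathbf{M} U = \begin{pmatrix} M_0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -M_0 \end{pmatrix}. \tag{16}$$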

One diagonal element is zero, and the other two are ± the scalar 
moment. The trace ($M_{xx} + M_{yy} + M_{zz}$), which is not changed by
an orthogonal transformation, started as zero in Eqn 6 and so 
remains zero. Put another way, the isotropic component is an 
invariant of the moment tensor and does not depend on the 
coordinate system. 

The point of the transformation is that inverting seismograms 
in a geographic coordinate system yields the moment tensor 
in that coordinate system. We then find its eigenvectors, the P, 
T, and null axes, and use Eqn 7 to find the fault normal and 
slip vectors and hence strike, dip, and slip angles. As part of the 
same process, the eigenvalues give the scalar moment. 
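
This recovery procedure can be sketched with a standard symmetric eigenvalue routine. The sketch assumes a pure double couple and uses unit-length eigenvectors, so Eqn 7 is inverted with a factor of 1/√2. Because eigenvector signs are arbitrary, the normal and slip vectors are recovered only up to sign and interchange, which is the fault plane and auxiliary plane ambiguity. The function names are ours.

```python
import numpy as np

def principal_axes(M):
    """Return unit T, P, and null axes and the scalar moment of a double couple."""
    eigenvalues, eigenvectors = np.linalg.eigh(M)   # eigenvalues ascending
    p_axis = eigenvectors[:, 0]                     # eigenvalue -M0
    b_axis = eigenvectors[:, 1]                     # eigenvalue 0
    t_axis = eigenvectors[:, 2]                     # eigenvalue +M0
    m0 = 0.5 * (eigenvalues[2] - eigenvalues[0])
    return t_axis, p_axis, b_axis, m0

def nodal_plane_vectors(t_axis, p_axis):
    """Invert Eqn 7 for unit-length t and p: n = (t + p)/sqrt(2), d = (t - p)/sqrt(2)."""
    n = (t_axis + p_axis) / np.sqrt(2.0)
    d = (t_axis - p_axis) / np.sqrt(2.0)
    return n, d

# Double couple of Fig. 4.4-1 with an arbitrary scalar moment of 2.
M = 2.0 * np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
t_axis, p_axis, b_axis, m0 = principal_axes(M)
n, d = nodal_plane_vectors(t_axis, p_axis)
M_rebuilt = m0 * (np.outer(n, d) + np.outer(d, n))
```

Rebuilding the tensor from the recovered vectors reproduces the original, whichever of the two planes the routine happens to label as the fault.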

Thus the moment tensor corresponding to a specific faulting 
geometry can be written in different ways. Figure 4.4-1 shows 
this in a two-dimensional geometry. The coordinate system ori¬ 
ented along and perpendicular to the fault has the fault normal 
and slip vectors as basis vectors, and the nonzero moment 
tensor components are M xy = M yx = M 0 (Eqn 3). If we transform 
the moment tensor to the new (primed) coordinate system with 
the P and T axes as basis vectors, 45° away from the first set, a 
two-dimensional version of Eqn 16 gives the moment tensor 



Fig. 4.4-6 A selection of moment tensors and their associated focal 
mechanisms. The top row shows an explosion (left) and an implosion
(right). The next three rows are for double-couple sources. The bottom
two rows show CLVD sources which have a baseball or eyeball/fried-egg 
appearance. (After Dahlen and Tromp (1998), with moment tensors 
transformed to the coordinate system with basis vectors pointing north, 
west, and up. Copyright © by Princeton University Press. Reprinted by 
permission of Princeton University Press.) 


$M_{x'x'} = -M_{y'y'} = M_0$. The transformation changes the components,
but the physical moment tensor stays the same, so these
two different-looking force systems give the same radiated 
seismic waves. Hence the seismic waves alone provide no way 
of deciding which is more “real.” Given that most earthquakes 
occur on faults about which we have other knowledge, we gen¬ 
erally view earthquakes as slip on a fault rather than dipoles. 
It is worth recalling that a similar concept appears whenever 
we transform vector or tensor quantities between coordinate 
systems. For example, Fig. 2.3-6 showed that a given physical 
state of stress could be represented either by normal stresses 
(diagonal terms in the stress tensor) or shear stresses (off- 
diagonal terms in the stress tensor), depending on the coordi¬ 
nate system. 
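
The two-dimensional case is easy to verify numerically. This minimal sketch rotates the tensor with $M_{xy} = M_{yx} = M_0$ by 45°, using the usual rule for transforming a second-rank tensor under a rotation of the coordinate axes. The function name is ours.

```python
import numpy as np

def rotate_tensor_2d(M, angle_deg):
    """Components of M in axes rotated counterclockwise by angle_deg."""
    a = np.radians(angle_deg)
    # Rows of A are the new (primed) basis vectors in the old coordinates.
    A = np.array([[np.cos(a), np.sin(a)],
                  [-np.sin(a), np.cos(a)]])
    return A @ M @ A.T

m0 = 1.0
M_fault = m0 * np.array([[0.0, 1.0],
                         [1.0, 0.0]])     # couples M_xy and M_yx
M_axes = rotate_tensor_2d(M_fault, 45.0)  # dipoles along the T and P axes
```

The rotated tensor is diagonal, with $M_0$ and $-M_0$ on the diagonal, and the trace is zero in both coordinate systems.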

Figure 4.4-6 shows the diagonalized moment tensor and focal
mechanism for some source geometries. The second, third,
and fourth rows show end-member double-couple mechanisms:
a vertical strike-slip fault (second row), a vertical dip-slip
fault (third row), and a 45°-dipping pure thrust fault (fourth
row). The first and last two rows, however, show very
different-looking mechanisms, which are discussed next. The
moment tensors are given in the coordinate system of Section
4.2.1, with basis vectors pointing north, west, and up. In
another coordinate system, such as spherical coordinates, the
components of the tensors would differ.
Fig. 4.4-7 An explosive source, which radiates energy equally in all
directions, is modeled using a triple dipole as an equivalent body force
system.

4.4.6 Isotropic and CLVD moment tensors 

If all three diagonal terms of the moment tensor are nonzero 
and equal, the polarity of the first motions (focal mechanism) is 
the same in all directions. Such a triple vector dipole of three 
equal and orthogonal force couples is the equivalent body 
force system for an explosion or an implosion (Fig. 4.4-7). The 
moment tensor looks like 

$$M = \begin{pmatrix} E & 0 & 0 \\ 0 & E & 0 \\ 0 & 0 & E \end{pmatrix}, \tag{17}$$

and has nonzero trace $3E$. A moment tensor with a nonzero
isotropic component represents a volume change. 

Most explosive sources are man-made mining or nuclear 
explosions. The ability to identify and locate nuclear explo¬ 
sions seismologically is critical for monitoring nuclear testing 
(Section 1.2). Natural explosive or implosive sources are rare, 
but may be associated with fluid and gas migration linked 
to magmatic processes or with sudden phase transitions of 
metastable minerals. High-velocity impacts of meteorites could 
also be modeled with explosive sources. 

The physical processes in explosions differ markedly from 
those for earthquakes. An explosion involves a sudden increase 
in pressure, which causes nonlinear deformation that can melt 
and even vaporize rock. As this shock wave of pressure ex¬ 
pands, its amplitude decreases until the deformations are small 
enough to occur elastically, yielding a spherical P wave (Section 
2.4.3). This propagating wave interacts with interfaces within 
the earth, including the surface, and generates SV and Rayleigh 
waves, as seen in the nuclear explosion seismogram in Fig. 1.2- 
19. Surprisingly, SH waves, including Love waves, are also 
observed. These would not be expected in a spherically sym¬ 
metric and isotropic earth, where P-SV and SH waves are 
decoupled. Several possibilities have been suggested, including 
tectonic release of deviatoric stress near the source, essentially 
triggering earthquakes, and giving the source both isotropic 
and double-couple components. 


Another class of non-double-couple seismic sources is compensated
linear vector dipoles (CLVDs). These are sets of three
force dipoles that are compensated, with one dipole $-2$ times
the magnitude of the others:

$$M = \begin{pmatrix} -\lambda & 0 & 0 \\ 0 & \lambda/2 & 0 \\ 0 & 0 & \lambda/2 \end{pmatrix}. \tag{18}$$

The trace of the moment tensor is zero, so there is no isotropic 
component. CLVDs are illustrated by the strange-looking 
bottom two rows in Fig. 4.4-6. By contrast with the beachball¬ 
looking focal mechanisms of double couples, the first motions 
for CLVDs look like baseballs (fifth row) or eyeballs (sixth 
row). Although sources with large CLVD components are 
rare, they have been identified in several complicated tectonic 
environments. 

Two primary explanations have been offered for CLVD 
mechanisms. Especially in volcanic areas, it is natural to 
think of an inflating magma dike, which can be modeled as a 
crack opening under tension. The moment tensor for such a
crack is³

$$M = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda + 2\mu \end{pmatrix}, \tag{19}$$

where $\lambda$ and $\mu$ are the Lamé elastic constants (Eqn 2.3.69). The
trace of this tensor is $3\lambda + 2\mu$, which is positive because the crack
opened. Thus we can decompose the tensor into two terms:

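$$\begin{pmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda + 2\mu \end{pmatrix} = \begin{pmatrix} E & 0 & 0 \\ 0 & E & 0 \\ 0 & 0 & E \end{pmatrix} + \begin{pmatrix} -\tfrac{2}{3}\mu & 0 & 0 \\ 0 & -\tfrac{2}{3}\mu & 0 \\ 0 & 0 & \tfrac{4}{3}\mu \end{pmatrix}. \tag{20}$$

The first term is an isotropic tensor whose diagonal components
$E = \lambda + \tfrac{2}{3}\mu$ are one-third of the trace, and the second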
term is a CLVD. Because, as we will see shortly, inversion of 
moment tensors for shallow earthquakes cannot resolve the 
isotropic component, the seismic waves from such a crack 
would look like a CLVD. 
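
The split of the tensile-crack tensor into isotropic and CLVD parts can be checked numerically. The following is a minimal sketch, assuming a Poisson solid ($\lambda = \mu$) with arbitrary units; the function name `split_isotropic` and the numerical values are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def split_isotropic(M):
    """Split a symmetric 3x3 moment tensor into isotropic and deviatoric parts."""
    E = np.trace(M) / 3.0          # one-third of the trace
    iso = E * np.eye(3)            # isotropic part, as in Eqn 17
    return iso, M - iso            # the remainder has zero trace

# Hypothetical Lame constants for a Poisson solid (lam = mu), arbitrary units.
lam, mu = 1.0, 1.0

# Tensile crack moment tensor of Eqn 19.
crack = np.diag([lam, lam, lam + 2.0 * mu])

iso, dev = split_isotropic(crack)
print("isotropic diagonal:", np.diag(iso))   # each entry is lam + 2*mu/3
print("deviatoric diagonal:", np.diag(dev))  # (-2/3, -2/3, 4/3) * mu
```

The deviatoric diagonal has one entry that is $-2$ times each of the other two, which is the CLVD pattern of Eqn 18.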

An alternative explanation is that CLVDs are due to near- 
simultaneous earthquakes on nearby faults of different geo¬ 
metries. For example, consider the sum of two double-couple 
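sources with moments $M_0$ and $2M_0$, expressed in the principal
axis coordinate system (Eqn 16):

$$\begin{pmatrix} M_0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -M_0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & -2M_0 & 0 \\ 0 & 0 & 2M_0 \end{pmatrix} = \begin{pmatrix} M_0 & 0 & 0 \\ 0 & -2M_0 & 0 \\ 0 & 0 & M_0 \end{pmatrix}. \tag{21}$$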

3 Aki and Richards (1980). 
Fig. 4.4-8 CLVD-type focal mechanisms for
earthquakes near the Bardarbunga volcano
in Iceland. The mechanisms are similar to
those shown in the lower right of Fig. 4.4-6.
These are thought to reflect reverse faulting
on cone-shaped ring faults surrounding the
magma chamber. In this model, deflation of
the magma chamber increases horizontal
compression, so the roof block above the
magma chamber subsides with respect to
the surrounding rock (right). (Nettles and
Ekström, 1998. J. Geophys. Res., 103,
17,973-83, copyright by the American
Geophysical Union.)


Thus, adding these two double couples yields a CLVD. In this 
example, both double-couple moment tensors are diagonal and 
so have the same eigenvector directions, but the P, B, and T axes 
of the first are the T, P, and B axes of the second. Thus, if the 
first earthquake were strike-slip on a vertical fault, the second 
would be normal faulting on a 45°-dipping fault (Fig. 4.2-16). 
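
As a quick numerical illustration of this sum, the sketch below adds two diagonal double couples with moments $M_0$ and $2M_0$ and checks that the result has the CLVD pattern. The value $M_0 = 1$ is an arbitrary assumption.

```python
import numpy as np

M0 = 1.0  # hypothetical scalar moment, arbitrary units

# Two double couples in a shared principal-axis system (as in Eqn 21).
dc1 = np.diag([M0, 0.0, -M0])
dc2 = np.diag([0.0, -2.0 * M0, 2.0 * M0])
total = dc1 + dc2

# For a CLVD, one eigenvalue is -2 times each of the other two.
vals = np.linalg.eigvalsh(total)
largest = vals[np.argmax(np.abs(vals))]
others = np.delete(vals, np.argmax(np.abs(vals)))
print("sum:", np.diag(total))
print("largest / others:", largest / others)
```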

Decomposing a CLVD into double couples bears out the 
concept that the moment tensors can be decomposed in differ¬ 
ent ways, with different interpretations. This is because the 
moment tensor represents the equivalent body force system, so 
different decompositions reflect the same force system and give 
the same seismic waves. Hence the seismic waves alone cannot 
distinguish between alternative decompositions. 

Multiple faulting events giving rise to apparent CLVDs have 
been reported. For example, Fig. 4.4-8 shows CLVD mechanisms
at a volcano in Iceland, which have been interpreted
as resulting from reverse faulting on cone-shaped ring faults 
beneath the caldera, triggered by deflation of the magma 
chamber. Such CLVDs and other non-double-couple seismic 
sources, like the single force for Mt St Helens (Fig. 4.4-2), occur 
in volcanic regions where faulting and magmatic processes 
interact. It is often difficult to distinguish the roles of the two 
processes, even when geological and other geophysical data are 
also used. Hence different interpretations of seismic events 
have been offered in areas including Hawaii and the Long 
Valley, California, caldera. 

4.4.7 Moment tensor inversion 

In addition to being an elegant representation of the source, the 
moment tensor has two advantages for source studies. First, it 
allows us to analyze seismograms without assuming that they 
result from slip on a fault. In some applications, such as deep 
earthquakes or volcanic earthquakes, we would like to identify 
possible isotropic or CLVD components. Second, the moment 


tensor makes it easier to invert seismograms to find source 
parameters. 

For example, consider the formulation we used to synthesize 
surface waves (Section 4.3.4). The predicted seismograms 
depended on fault geometry factors that are complicated pro¬ 
ducts of trigonometric functions of the fault strike, dip, and slip 
angles. This is not a problem in forward modeling, but makes 
it hard to invert the seismograms to find the fault angles. The 
inverse problem is much easier if we write the seismograms as 
linear functions of components of the moment tensor. 

To see this, we represent the source by a vector m, containing
components of the moment tensor. Although the tensor has
nine components, only six are independent, because the tensor
is symmetric. We then extend the idea of a Green’s function
which we previously used to represent the effect on a seismogram
of an earthquake with a particular fault geometry
(Eqn 4.3.15). Here, we define $G_{ij}(t)$ as the seismogram at the $i$th
seismometer due to the moment tensor component $m_j$. $G_{ij}(t)$
includes the effects of the seismometer and earth structure
along the path from the source to this seismometer, so the $i$th
seismogram is the sum of the Green’s functions weighted by the
moment tensor components,

$$u_i(t) = \sum_{j=1}^{6} G_{ij}(t)\, m_j. \tag{22}$$

Because we have many seismograms, we can write this as a 
vector-matrix equation 

u = Gm, (23) 

where u is a vector composed of the seismograms at n stations 
and G is the Green’s function matrix. G has as many rows as 
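seismometers and as many columns as moment tensor components,
so Eqn 23 looks like

$$\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = \begin{pmatrix} G_{11} & G_{12} & \cdots & G_{16} \\ G_{21} & G_{22} & \cdots & G_{26} \\ \vdots & \vdots & & \vdots \\ G_{n1} & G_{n2} & \cdots & G_{n6} \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ \vdots \\ m_6 \end{pmatrix}. \tag{24}$$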


This is an overdetermined system of linear equations with 
more equations (n) than unknowns (6). We often encounter 
such systems when we invert large quantities of data to estim¬ 
ate a smaller number of parameters. As we noted in Section 2.8, 
and will explore in depth in Chapter 7, we cannot invert the 
matrix G because it is not square. Instead, we find the moment 
tensor that best matches the observed seismograms in a least 
squares sense, using what is called the generalized inverse of G, 


$$\mathbf{m} = (\mathbf{G}^T \mathbf{G})^{-1} \mathbf{G}^T \mathbf{u}. \tag{25}$$


Thus, because the seismograms are linear functions of the 
moment tensor components, they can be inverted to find the 
tensor components. 
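
The least squares solution can be sketched numerically as follows. This is only an illustration under stated assumptions: the matrix `G` is filled with random numbers standing in for real Green's function samples, and `m_true` and the noise level are invented for the test. The sketch evaluates the generalized inverse directly and compares it with NumPy's `lstsq` solver, which is usually the numerically safer choice.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs = 40                                   # number of data (rows of G)
G = rng.normal(size=(n_obs, 6))              # stand-in Green's function matrix (assumed)
m_true = np.array([1.0, -0.5, 0.3, 0.0, 0.8, -1.2])  # hypothetical source vector

# Synthetic data u = G m plus a small amount of noise.
u = G @ m_true + 0.01 * rng.normal(size=n_obs)

# Generalized inverse written out explicitly: m = (G^T G)^-1 G^T u.
m_normal = np.linalg.inv(G.T @ G) @ G.T @ u

# Equivalent least squares solve without forming the inverse explicitly.
m_lstsq, *_ = np.linalg.lstsq(G, u, rcond=None)

print("explicit inverse:", np.round(m_normal, 3))
print("lstsq solver:    ", np.round(m_lstsq, 3))
```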

Although we defer discussing most properties of generalized
inverse solutions until Chapter 7, a point worth noting
is that how well we can estimate a moment tensor component
depends on the Green’s function. Equation 22 shows that the
seismogram involves products of moment tensor components
with their corresponding Green’s functions. Thus, if $G_{ij}$ is zero,
$m_j$ has no effect on the seismogram, no matter how big it is.
Similarly, if $G_{ij}$ is small, $m_j$ has little effect on the seismogram.
Conversely, inverting the seismogram to determine $m_j$ essentially
involves dividing the seismogram by $G_{ij}$. Hence, if $G_{ij}$ is
small, dividing by it gives a large number, so any small errors
or noise in the data produce spuriously large values of $m_j$. Put
another way, we get good estimates of components to which
the seismogram is fairly sensitive, but poorer ones for components
on which the seismogram depends weakly.⁴

We now consider one inversion approach, a method for surface
waves corresponding to the forward modeling in Section
4.3.4. In a coordinate system with the source at the north pole,
the vertical component of Rayleigh waves on a seismometer at
$\mathbf{r} = (r, \theta, \phi)$ can be written as an inverse Fourier transform:

$$u(\mathbf{r}, t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} U(\omega, \theta, \phi)\, e^{i\omega t}\, d\omega. \tag{26}$$


The spectral amplitude $U(\omega, \theta, \phi)$ is a complex number representing
the source, the effect of the seismometer, and the elastic
and anelastic effects of propagation from the source to the receiver.
As in Eqn 4.3.20, we write

$$U(\omega, \theta, \phi) = V(\omega, \phi)\, H(\omega, \theta),$$

$$H(\omega, \theta) = I(\omega)\, \frac{1}{\sqrt{\sin\theta}}\, e^{-i\omega a\theta/c}\, e^{-\omega a\theta/2QU}\, e^{im\pi/2}. \tag{27}$$

$V(\omega, \phi)$ is the radiation pattern term reflecting the effect of
source geometry, which we want to find, whereas $H(\omega, \theta)$ represents
the effects of the seismometer and of propagation,
which we treat as known. $I(\omega)$ is the effect of the seismometer,
and the remaining terms are propagation effects, including
$e^{-\omega a\theta/2QU}$, the effect of attenuation as the wave travels a distance
$\theta$ (including any $2\pi$ terms). In these expressions, $a$ is the
earth’s radius, $m$ is the number of polar or antipolar passages,
and $c$, $U$, and $Q$ are the phase velocity, group velocity, and
attenuation at the frequency $\omega$.

4 We will formalize this idea using eigenvalues in Section 7.3, but before doing so,
we can see intuitively that an estimate of the number of white cats in a dimly lit room
will be better than that of black cats.

To set up the inversion, we write the radiation pattern, which
shows how the amplitude at a given frequency varies with the
azimuth ($\phi$) of the receiver from the source, in terms of linear
combinations of the moment tensor components

$$V(\omega, \phi) = -P_R \left[ M_{xy} \sin 2\phi - \tfrac{1}{2}(M_{yy} - M_{xx}) \cos 2\phi \right] + \tfrac{1}{3}(S_R + N_R) M_{zz} + \tfrac{1}{6}(2N_R - S_R)(M_{xx} + M_{yy}) + iQ_R (M_{yz} \sin\phi + M_{xz} \cos\phi). \tag{28}$$


This expression is analogous to the radiation pattern for 
Rayleigh waves due to slip on a fault (Eqn 4.3.20). The dif¬ 
ference is that here the seismic source is written in terms of 
moment tensor components, rather than as products of trig¬ 
onometric functions of the fault strike, dip, and slip angles. 
Thus Eqn 28 represents more general seismic sources than 
double couples due to slip on a simple fault. 

As before, the radiation pattern depends on excitation functions
derived from the radial eigenfunctions of spheroidal
modes of the appropriate frequency, which describe how a
source at a given depth causes displacements as a function of
frequency. However, in addition to the excitation functions
in Eqn 4.3.20 ($P_R$, $S_R$, $Q_R$), we have the excitation function
$N_R$ that applies to an isotropic source. To see this, recall
that for an explosion the moment tensor (Eqn 17) has equal
diagonal elements ($M_{xx} = M_{yy} = M_{zz} = M_0$) and zeroes off the
diagonal ($M_{xy} = M_{yz} = M_{xz} = 0$). Substituting these into Eqn 28
yields $V(\omega, \phi) = M_0 N_R$, which is a radiation pattern that
depends on $N_R$ and is azimuthally symmetric, as expected for
an explosion. Conversely, if the source has no isotropic component
($M_{xx} + M_{yy} + M_{zz} = 0$), $N_R$ drops out of Eqn 28.
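
Both statements can be verified by direct substitution. The short derivation below is a worked check, assuming the coefficients $\tfrac{1}{3}(S_R + N_R)$ for $M_{zz}$ and $\tfrac{1}{6}(2N_R - S_R)$ for $M_{xx} + M_{yy}$ in the radiation pattern.

```latex
% Explosion: M_xx = M_yy = M_zz = M_0, all off-diagonal terms zero.
% The P_R and Q_R terms vanish, leaving
V(\omega,\phi) = \tfrac{1}{3}(S_R + N_R)\,M_0 + \tfrac{1}{6}(2N_R - S_R)\,(2M_0)
               = M_0\left[\tfrac{1}{3}S_R + \tfrac{1}{3}N_R + \tfrac{2}{3}N_R - \tfrac{1}{3}S_R\right]
               = M_0 N_R .

% General source: collect the terms multiplying N_R.
\tfrac{1}{3}N_R\,M_{zz} + \tfrac{1}{3}N_R\,(M_{xx} + M_{yy})
               = \tfrac{1}{3}N_R\,(M_{xx} + M_{yy} + M_{zz}),
% which is zero whenever the trace of the moment tensor is zero.
```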

We can formulate the inverse problem using Eqn 28. At a
given frequency, separating $V(\omega, \phi)$ into real and imaginary
parts yields the matrix equation

$$\begin{pmatrix} \mathrm{Re}\, V(\omega, \phi) \\ \mathrm{Im}\, V(\omega, \phi) \end{pmatrix} = \mathbf{B}\mathbf{m}, \tag{29}$$

where m is a vector composed of the moment tensor components
that we seek to find,

$$\mathbf{m} = \begin{pmatrix} M_{xy} \\ M_{yy} - M_{xx} \\ M_{zz} \\ M_{xx} + M_{yy} \\ M_{yz} \\ M_{xz} \end{pmatrix}, \tag{30}$$

and B is the known matrix

$$\mathbf{B} = \begin{pmatrix} -P_R \sin 2\phi & \tfrac{1}{2} P_R \cos 2\phi & \tfrac{1}{3}(S_R + N_R) & \tfrac{1}{6}(2N_R - S_R) & 0 & 0 \\ 0 & 0 & 0 & 0 & Q_R \sin\phi & Q_R \cos\phi \end{pmatrix} \tag{31}$$

containing the excitation functions and azimuthal dependence.

To invert seismograms for the moment tensor, we divide the
Fourier transform of the seismogram from the station at $\mathbf{r}_i$ by
the propagation and seismometer term $H(\omega, \theta_i)$ (Eqn 27) to
find the complex amplitude $V(\omega, \phi_i)$. Data from only one seismic
station yields two equations in six unknowns, so we cannot
find m. However, with data at three or more stations, all six
components of m can in principle be found. We form a vector
v from the $V(\omega, \phi_i)$ values observed at each of the $n$ stations.
We similarly use the values of B for each station, and write a
vector-matrix equation equating the observed amplitudes v to
those predicted by the known matrix B and the moment tensor
m that we seek,

$$\mathbf{v} = \mathbf{B}\mathbf{m}, \tag{32}$$

where

$$\mathbf{v} = \begin{pmatrix} \mathrm{Re}\, V(\omega, \phi_1) \\ \mathrm{Im}\, V(\omega, \phi_1) \\ \vdots \\ \mathrm{Re}\, V(\omega, \phi_n) \\ \mathrm{Im}\, V(\omega, \phi_n) \end{pmatrix} \tag{33}$$

and

$$\mathbf{B} = \begin{pmatrix} -P_R \sin 2\phi_1 & \tfrac{1}{2} P_R \cos 2\phi_1 & \tfrac{1}{3}(S_R + N_R) & \tfrac{1}{6}(2N_R - S_R) & 0 & 0 \\ 0 & 0 & 0 & 0 & Q_R \sin\phi_1 & Q_R \cos\phi_1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ -P_R \sin 2\phi_n & \tfrac{1}{2} P_R \cos 2\phi_n & \tfrac{1}{3}(S_R + N_R) & \tfrac{1}{6}(2N_R - S_R) & 0 & 0 \\ 0 & 0 & 0 & 0 & Q_R \sin\phi_n & Q_R \cos\phi_n \end{pmatrix}. \tag{34}$$

With more than three stations, there are more equations than
unknowns and Eqn 32 is solved using the least squares solution
(Eqn 25), giving

$$\mathbf{m} = (\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T \mathbf{v}. \tag{35}$$

This solution gives the moment tensor that best predicts the
observed spectral amplitudes. It is estimated at a given frequency
for a seismic source that is a delta function in time.
Time variation in the source can be examined by solving for m
at different frequencies.

An important limitation results from the fact that the middle
two columns in matrix B, corresponding to $M_{zz}$ and $M_{xx} + M_{yy}$,
do not contain $\phi$, and so have no azimuthal variation. Therefore,
no matter how many stations we use at a given frequency,
we are solving only for the sum

$$\tfrac{1}{3}(S_R + N_R) M_{zz} + \tfrac{1}{6}(2N_R - S_R)(M_{xx} + M_{yy}), \tag{36}$$

which is the same at all stations. The inversion thus cannot
find $M_{xx} + M_{yy}$ and $M_{zz}$ separately, but only their sum, which is
the isotropic portion of the source corresponding to possible
volume changes.

One way to deal with this problem is to use data at different
frequencies, where the coefficients of $M_{zz}$ and $M_{xx} + M_{yy}$ are
quite different. This is often difficult because these coefficients
vary slowly with frequency for shallow earthquakes (consider
$S_R(\omega)$ for the 11 km-deep earthquake in Fig. 4.3-14). Surface
wave moment tensor inversions thus often constrain the source
to have no isotropic portion, so that $M_{xx} + M_{yy} = -M_{zz}$. In this
case,

$$V(\omega, \phi) = -P_R \left[ M_{xy} \sin 2\phi - \tfrac{1}{2}(M_{yy} - M_{xx}) \cos 2\phi \right] - \tfrac{1}{2} S_R (M_{yy} + M_{xx}) + iQ_R (M_{yz} \sin\phi + M_{xz} \cos\phi), \tag{37}$$

so the inversion is for a vector with five components. $N_R$, the
excitation function for an isotropic source, no longer enters
into the radiation pattern.

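We then rewrite the inversion equation (Eqn 32) as

$$\mathbf{v} = \mathbf{A}\mathbf{m}, \tag{38}$$

and solve for

$$\mathbf{m} = \begin{pmatrix} M_{xy} \\ M_{yy} - M_{xx} \\ M_{yy} + M_{xx} \\ M_{yz} \\ M_{xz} \end{pmatrix}, \tag{39}$$

given the known matrix

$$\mathbf{A} = \begin{pmatrix} -P_R \sin 2\phi_1 & \tfrac{1}{2} P_R \cos 2\phi_1 & -\tfrac{1}{2} S_R & 0 & 0 \\ 0 & 0 & 0 & Q_R \sin\phi_1 & Q_R \cos\phi_1 \\ -P_R \sin 2\phi_2 & \tfrac{1}{2} P_R \cos 2\phi_2 & -\tfrac{1}{2} S_R & 0 & 0 \\ 0 & 0 & 0 & Q_R \sin\phi_2 & Q_R \cos\phi_2 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ -P_R \sin 2\phi_n & \tfrac{1}{2} P_R \cos 2\phi_n & -\tfrac{1}{2} S_R & 0 & 0 \\ 0 & 0 & 0 & Q_R \sin\phi_n & Q_R \cos\phi_n \end{pmatrix}. \tag{40}$$

The solution gives five moment tensor components, because
adding and subtracting $m_2$ and $m_3$ yields $M_{xx}$ and $M_{yy}$. $M_{zz}$
is then found from $-(M_{xx} + M_{yy})$, but is not independent of
them.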
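
The constrained five-component inversion can be sketched as below. This is an illustration only: the excitation function values `P_R`, `S_R`, `Q_R`, the station azimuths, and the source components are all invented for the test, the helper name `build_A` is hypothetical, and the synthetic data are noise-free. Each station contributes one row for the real part of $V$ and one for the imaginary part.

```python
import numpy as np

def build_A(azimuths, P_R, S_R, Q_R):
    """Stack two rows per station (Re V, then Im V) for the deviatoric inversion."""
    rows = []
    for phi in azimuths:
        rows.append([-P_R * np.sin(2 * phi), 0.5 * P_R * np.cos(2 * phi),
                     -0.5 * S_R, 0.0, 0.0])
        rows.append([0.0, 0.0, 0.0, Q_R * np.sin(phi), Q_R * np.cos(phi)])
    return np.array(rows)

# Hypothetical excitation functions at one frequency and source depth.
P_R, S_R, Q_R = 1.0, 0.6, 0.3
azimuths = np.radians([10.0, 75.0, 140.0, 220.0, 305.0])  # assumed station azimuths

# Hypothetical deviatoric source (M_zz is fixed by the zero-trace constraint).
Mxx, Myy, Mxy, Myz, Mxz = 0.4, -1.0, 0.7, 0.2, -0.3
m_true = np.array([Mxy, Myy - Mxx, Myy + Mxx, Myz, Mxz])

A = build_A(azimuths, P_R, S_R, Q_R)
v = A @ m_true                                  # synthetic Re/Im amplitudes
m_est, *_ = np.linalg.lstsq(A, v, rcond=None)   # least squares solution

# Recover the individual diagonal components from m_2 and m_3.
Myy_est = 0.5 * (m_est[2] + m_est[1])
Mxx_est = 0.5 * (m_est[2] - m_est[1])
Mzz_est = -(Mxx_est + Myy_est)
print(Mxx_est, Myy_est, Mzz_est)
```

If `Q_R` is made very small in this sketch, the last two columns of `A` shrink and the recovered `Myz` and `Mxz` become sensitive to any noise added to `v`.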

Another significant difficulty in surface wave moment tensor
inversion stems from the fact that the excitation function $Q_R$
is zero at the earth’s surface (Fig. 4.3-14) because it is proportional
to the shear stress. At shallow depths $Q_R$ is small, so
$M_{xz}$ and $M_{yz}$ are poorly determined for shallow earthquakes
(< 30 km when inverting at 256 s). This leaves only three tensor
components well determined, which are insufficient to determine
the fault geometry.





This problem can be addressed in several ways. One is to
invert shorter-period waves, which have larger amplitudes
(Fig. 4.3-14). However, the effects of lateral heterogeneity
increase for shorter periods, due to the shorter wavelengths.
A second approach is to constrain $M_{xz}$ and $M_{yz}$ to be zero and
invert for only the three components $M_{xx}$, $M_{yy}$, $M_{xy}$. This forces
one eigenvector to be vertical and makes the major double
couple take one of three forms: pure strike-slip on a vertical
plane (vertical null axis), thrust faulting on a 45°-dipping plane
(vertical T axis), or normal faulting on a 45°-dipping plane
(vertical P axis). An interesting way to view this is to note
that shallow earthquakes on vertical dip-slip faults, for which
the only nonzero fault geometry factor (Eqn 4.3.20) is $q_R$, have
radiation patterns proportional to $Q_R(\omega)$ and so excite surface
waves very inefficiently. Hence constraining $M_{xz}$ and $M_{yz}$ to be
zero excludes any vertical dip-slip component from the focal
mechanism, so a complete solution requires other data, such
as first motions or geological knowledge. A third method is to
constrain one nodal plane from first motions and then do a
linear inversion for the second plane.

We can also use this formulation to invert transverse component
Love wave data, using the analogous expressions

$$U(\omega, \theta, \phi) = V(\omega, \phi)\, I(\omega)\, \frac{e^{-i\pi/4}}{\sqrt{\sin\theta}}\, e^{-i\omega a\theta/c}\, e^{-\omega a\theta/2QU}\, e^{im\pi/2}, \tag{41}$$

$$V(\omega, \phi) = P_L \left[ \tfrac{1}{2}(M_{xx} - M_{yy}) \sin 2\phi - M_{xy} \cos 2\phi \right] + iQ_L \left[ -M_{xz} \sin\phi + M_{yz} \cos\phi \right]. \tag{42}$$


4.4.8 Interpretation of moment tensors 

In general, once a moment tensor has been found by inverting
seismograms, it will be more complicated than expected for a
double couple. Even if the source were a pure double couple,
noise in the data and imperfect knowledge of earth structure
would likely produce a tensor that, once diagonalized, would
look like

$$M = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}, \qquad |\lambda_1| \geq |\lambda_2| \geq |\lambda_3|, \tag{43}$$

with eigenvectors $\hat{n}_1$, $\hat{n}_2$, and $\hat{n}_3$.

If M represents a double couple, then $\lambda_1 = -\lambda_2$, and $\lambda_3 = 0$.
However, unless the moment tensor was constrained to satisfy
these conditions, it generally will not do so. In most cases,
$\lambda_1 \approx -\lambda_2$, and $|\lambda_2| \gg |\lambda_3|$, so M is approximately, but not
exactly, a double couple. In this case, we interpret the moment
tensor by decomposing it, as we did for the CLVD examples in
Section 4.4.6. If there is an isotropic component, we remove
it via

$$\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} = \begin{pmatrix} E & 0 & 0 \\ 0 & E & 0 \\ 0 & 0 & E \end{pmatrix} + \begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & \lambda'_2 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix}, \tag{44}$$

where $E = (\lambda_1 + \lambda_2 + \lambda_3)/3$. The remaining term is a deviatoric
moment tensor, with zero isotropic component and components
equal to the deviatoric eigenvalues $\lambda'_1 = \lambda_1 - E$, $\lambda'_2 = \lambda_2 - E$,
and $\lambda'_3 = \lambda_3 - E$. If needed, the deviatoric eigenvalues are
renumbered so that $|\lambda'_1| \geq |\lambda'_2| \geq |\lambda'_3|$. If the inversion has no
isotropic component, the deviatoric moment tensor is the
moment tensor resulting from the inversion.

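The deviatoric moment tensor can be decomposed in several
ways. One is in terms of two double couples, called the major
and minor double couples:

$$\begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & \lambda'_2 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix} = \begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & -\lambda'_1 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\lambda'_3 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix}. \tag{45}$$

The first tensor is the major double couple, with scalar moment
$|\lambda'_1|$, and the second is the minor double couple, with scalar
moment $|\lambda'_3|$. Usually, the magnitude of the major double
couple is much larger, and we treat it as the earthquake’s
source mechanism.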
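
The two steps (removing the isotropic part, then splitting the deviatoric part into major and minor double couples) can be sketched as follows. The function name `major_minor_double_couples` and the test tensor are illustrative assumptions; the sketch works in the principal-axis system, so it returns diagonal tensors.

```python
import numpy as np

def major_minor_double_couples(M):
    """Return E, the ordered deviatoric eigenvalues, and the major and minor
    double couples, all expressed in the principal-axis system."""
    lam = np.linalg.eigvalsh(M)              # eigenvalues of the symmetric tensor
    E = lam.sum() / 3.0                      # isotropic part
    dev = lam - E                            # deviatoric eigenvalues
    dev = dev[np.argsort(-np.abs(dev))]      # order so |l1| >= |l2| >= |l3|
    l1, l2, l3 = dev
    major = np.diag([l1, -l1, 0.0])          # scalar moment |l1|
    minor = np.diag([0.0, -l3, l3])          # scalar moment |l3|
    return E, dev, major, minor

# Hypothetical moment tensor, arbitrary units.
M = np.array([[1.0, 0.2, 0.0],
              [0.2, -0.9, 0.1],
              [0.0, 0.1, 0.2]])

E, dev, major, minor = major_minor_double_couples(M)
print("E =", E)
print("deviatoric eigenvalues:", dev)
print("minor/major moment ratio:", abs(dev[2]) / abs(dev[0]))
```

Because the deviatoric eigenvalues sum to zero, the middle entry of the sum `major + minor` is $-\lambda'_1 - \lambda'_3 = \lambda'_2$, so the two double couples add back to the deviatoric tensor.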

As an example, consider M for an intermediate-depth thrust
earthquake in the Kuril subduction zone near Japan.⁵ The
moment tensor inverted from Rayleigh waves of period 256 s
recorded on the IDA network of digital very long-period seismometers
was

$$M = \begin{pmatrix} 0.12 & -0.17 & -0.06 \\ -0.17 & -1.54 & -1.44 \\ -0.06 & -1.44 & 1.42 \end{pmatrix}, \tag{46}$$

where the components are in units of $10^{27}$ dyn-cm. Diagonalizing
the matrix yields the diagonal form of Eqn 43, with eigenvectors
$\hat{n}_1 = (0.08, 0.92, 0.37)$, $\hat{n}_2 = (0.00, -0.38, 0.93)$,
and $\hat{n}_3 = (-0.99, 0.07, 0.03)$. The isotropic component
was constrained in the inversion to be zero. Because the minor
double couple has a moment only 6% that of the major double
couple, we assume that the major double couple represents the
earthquake mechanism. $\hat{n}_1$ is the P axis of the double couple,
$\hat{n}_2$ is the T axis, and $\hat{n}_3$ is the null axis. Using these axes and our
tectonic preconceptions (which are not, of course, always
valid), we decide (using a stereonet or a computer) that the
earthquake was a thrust on a fault plane striking N189°E
and dipping 23°W. The auxiliary plane strikes N3°E and
dips 67°E.

5 This example was provided by A. Michael.
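
The diagonalization in this example can be reproduced numerically. The sketch below assumes that the tensor is symmetric and has zero trace (the isotropic component was constrained to zero), so that the entries $M_{yz} = -1.44$ and $M_{zz} = 1.42$ follow from the other components; those two values are inferred here rather than read from the original matrix.

```python
import numpy as np

# Moment tensor of the Kuril example, in units of 1e27 dyn-cm.
# The last column is completed from symmetry and the zero-trace constraint (assumption).
M = np.array([[ 0.12, -0.17, -0.06],
              [-0.17, -1.54, -1.44],
              [-0.06, -1.44,  1.42]])

vals, vecs = np.linalg.eigh(M)               # eigenvalues and unit eigenvectors

# Order so that |lambda_1| >= |lambda_2| >= |lambda_3|.
order = np.argsort(-np.abs(vals))
vals, vecs = vals[order], vecs[:, order]

ratio = abs(vals[2]) / abs(vals[0])          # minor / major double-couple moment
print("eigenvalues:", np.round(vals, 2))
print("minor/major ratio:", round(ratio, 3))
print("eigenvector for lambda_1:", np.round(vecs[:, 0], 2))
```

The sign of each eigenvector returned by `eigh` is arbitrary, so an axis may come out reversed relative to the values quoted in the text; this does not change the P, T, or null axis it defines.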

In this case, we discarded the minor double couple and 
assumed that the earthquake was a single double couple. It is 
likely that the minor double couple often results from lateral 
heterogeneity in the earth (the velocity and attenuation models 
used in this inversion were laterally homogeneous), noise in 
the data, and deviation of the earthquake from a point source. 
You may recall from the surface wave example in Fig. 4.3-13 
that the data were approximately fit by the amplitude radiation 
pattern predicted by the focal mechanism, but some stations 
had higher amplitudes, whereas others had lower amplitudes. 
Similar effects can occur for the amplitude and phase data in a 
moment tensor inversion. As a result, even if the source were a 
pure double couple, the inversion fits the deviations in the data 
from the predictions of the best-fitting double couple, and so 
yields a moment tensor differing somewhat from the double 
couple. Thus the better the inversion method reflects the earth’s 
heterogeneity and source complexity, the less the tendency for 
there to be spurious portions of the moment tensor. In some 
cases, however, the minor double couple may have physical 
significance, such as for simultaneous ruptures on nearby faults 
with different orientations. 

The moment tensor can be decomposed in other ways. One
is into a double couple and a CLVD:

$$\begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & \lambda'_2 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix} = \begin{pmatrix} \lambda'_1 + \lambda'_3/2 & 0 & 0 \\ 0 & -\lambda'_1 - \lambda'_3/2 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} -\lambda'_3/2 & 0 & 0 \\ 0 & -\lambda'_3/2 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix}. \tag{48}$$

The relative strength of the double couple and CLVD is given
by the ratio of the smallest and largest deviatoric eigenvalues,
$\varepsilon = \lambda'_3/\lambda'_1$. $\varepsilon = 0$ indicates a pure double couple, and $\varepsilon = \pm 0.5$
shows a pure CLVD source. About 4% of the mechanisms
in the Harvard global moment tensor catalog, derived
from inversions that are not constrained to yield double
couples, have $|\varepsilon| > 0.3$. Some of these may be artifacts of the
inversion process similar to spurious minor double couples,
but some appear to be real source effects.
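
A minimal sketch of this decomposition and of the ratio $\varepsilon$ follows. The function name `dc_clvd` and the three test sources are illustrative assumptions; the input is a set of deviatoric eigenvalues, which the function orders by absolute value before splitting.

```python
import numpy as np

def dc_clvd(dev_eigenvalues):
    """Split ordered deviatoric eigenvalues into double-couple and CLVD parts."""
    dev = np.asarray(dev_eigenvalues, dtype=float)
    l1, l2, l3 = dev[np.argsort(-np.abs(dev))]   # |l1| >= |l2| >= |l3|
    eps = l3 / l1                                # 0 for a pure DC, +/-0.5 for a pure CLVD
    dc = np.diag([l1 + l3 / 2.0, -l1 - l3 / 2.0, 0.0])
    clvd = np.diag([-l3 / 2.0, -l3 / 2.0, l3])
    return eps, dc, clvd, np.array([l1, l2, l3])

# Hypothetical sources, arbitrary units.
eps_dc, _, _, _ = dc_clvd([1.0, -1.0, 0.0])        # pure double couple
eps_clvd, _, _, _ = dc_clvd([-2.0, 1.0, 1.0])      # pure CLVD
eps_mix, dc, clvd, ordered = dc_clvd([2.0, -1.7, -0.3])  # mostly double couple

print(eps_dc, eps_clvd, eps_mix)
```

For the mixed source, `dc + clvd` reproduces the ordered deviatoric eigenvalues on its diagonal, which is a direct check of the decomposition.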

However, as our CLVD example (Section 4.4.6) showed,
both moment tensor decompositions and their interpretations
are not unique. For example, Eqn 45 showed a decomposition
into a major double couple with moment $\lambda'_1$ and a minor
double couple with moment $\lambda'_3$. We could also decompose the
tensor with the same major double couple but a minor double
couple with moment $\lambda'_2$:

$$\begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & \lambda'_2 & 0 \\ 0 & 0 & \lambda'_3 \end{pmatrix} = \begin{pmatrix} \lambda'_1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\lambda'_1 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda'_2 & 0 \\ 0 & 0 & -\lambda'_2 \end{pmatrix}. \tag{49}$$

The two decompositions sum to the correct value for each
tensor component, which is the equivalent body force, but
using tensors of differing scalar moments. This is analogous
to the way a vector can be decomposed into various sums of
vectors with different magnitudes.

Moment tensor solutions have become an important tool of 
global seismology. Globally distributed broadband digital 
seismometers permit reliable focal mechanisms to be generated 
within minutes after most earthquakes with $M_s > 5.5$ and made
publicly available through e-mail and the Internet. Several 
organizations carry out this service, including the Harvard 
centroid moment tensor (CMT) project. The CMT method 
inverts two parts of seismograms: long-period (T > 40 s) body 
waves and very long-period (T > 135 s) surface waves, called 
mantle waves. The CMT inversion yields both a moment 
tensor and a centroid time and location. This location often 
differs from that listed in earthquake bulletins, such as that of 
the International Seismological Centre (ISC), because the two 
locations tell different things. Earthquake location bulletins 
based upon arrival times of body wave phases like P and S give 
the hypocenter: the point in space and time where rupture
began. CMT solutions, using full waveforms, give the centroid,
or average location in space and time, of the seismic energy 
release. As a result, CMT origin times are almost always 
later than ISC times. The availability of large numbers of high- 
quality mechanisms (the Harvard project has produced more 
than 17,000 solutions since 1976) is of great value in many 
applications, especially tectonic studies. 

4.5 Earthquake geodesy 

4.5.1 Measuring ground deformation 

So far in this chapter we have studied earthquakes using trans¬ 
ient displacements due to the propagating seismic waves they 
generate. However, the large, rapid deformation in an earth¬ 
quake results from a complex deformation field which extends 
over a broad region and a long time. Hence, additional inform¬ 
ation about earthquakes and the processes causing them can 
be obtained by measuring slow ground deformation using 
techniques from geodesy , the science of the earth’s shape. 
Most such techniques rely on detecting the motion of geodetic 
monuments,¹ which are markers in the ground.

1 The most familiar monuments are the metal disks attached to rocks often seen at 
mountain peaks, but various other designs are also used in hope of minimizing the ef¬ 
fects of soil or near-surface motion that mask the tectonic movement. In soft sedi¬ 
ment, monuments are often steel rods driven deep into the earth. The popular term for 
monuments is “benchmarks,” although geodesists traditionally reserve this term for 
monuments used to study vertical motions. 


Until recently, these measurements were typically made by 
triangulation, which measures the angles between monuments 
using a theodolite, or trilateration, which measures distances
with a laser. Vertical motion was measured by leveling, using a 
precise level to sight on a distant measuring rod. However, the 
advent of geodetic methods using signals from space permits all 
three components of position to be measured to sub-centimeter 
precision. As a result, geodetic data before and after earth¬ 
quakes now give coseismic motion to high precision much 
more easily than was previously possible. 

Although the space-based technologies are among the most 
complex used in the earth sciences, in essence they use electro¬ 
magnetic waves in ways analogous to those we have discussed 
for seismic waves. Three of these techniques are used to locate 
geodetic markers. Very Long Baseline Interferometry (VLBI) 
uses the difference in the time when radio signals from distant 
quasars arrive at different points on earth. Satellite Laser 
Ranging (SLR) uses the time required by light from ground- 
based lasers to bounce off satellites. The third approach relies 
on the travel time of radio signals between satellites and ground 
stations. 

Although the various systems provide similar data, the third
approach via the Global Positioning System (GPS)² is presently
the system of choice for most tectonic applications. GPS was
developed in the late 1970s by the US Department of Defense
for real-time positioning and navigation. A constellation of
satellites transmit coded timing signals on a pair of microwave
carrier frequencies synchronized to very precise on-board
atomic clocks. The timing signals are modulations of the carrier
frequencies, analogous to those we discussed in the context
of phase and group velocities (Section 2.8.1). By determining
the ranges to a minimum of four satellites from the signal
delays and the broadcast satellite orbit information, a single
GPS receiver can determine its three-dimensional position
to a precision of 5 to 100 meters, depending on the level of
signal degradation imposed by the military (Fig. 4.5-1).³ This
operation is conceptually the same as locating an earthquake
from arrivals at multiple seismometers, which we discuss in
Section 7.2. GPS positions are two to three times more precise
in the horizontal than in the vertical direction, because radio
signals arrive only from above, just as earthquake locations are
less precise in depth because waves arrive only from below.
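
The idea of solving for a position from ranges to several satellites can be sketched as an iterative least squares problem, much like an earthquake location. The example below is a simplified illustration, not a description of actual GPS processing: the satellite coordinates, receiver position, and clock offset are invented, the data are noise-free, and atmospheric delays are ignored. The unknowns are the three receiver coordinates plus a receiver clock offset expressed as a distance, which is why at least four satellites are needed.

```python
import numpy as np

def locate_receiver(sat_pos, pseudoranges, n_iter=15):
    """Gauss-Newton solution for receiver position (m) and clock offset (m)."""
    x = np.zeros(4)                              # start at the earth's center, zero offset
    for _ in range(n_iter):
        diff = x[:3] - sat_pos                   # vectors from satellites to receiver
        rho = np.linalg.norm(diff, axis=1)       # geometric ranges
        resid = pseudoranges - (rho + x[3])      # observed minus predicted
        # Partial derivatives of predicted range with respect to x, y, z, offset.
        J = np.hstack([diff / rho[:, None], np.ones((len(rho), 1))])
        dx, *_ = np.linalg.lstsq(J, resid, rcond=None)
        x = x + dx
    return x

# Hypothetical geocentric satellite positions (m), all above the receiver.
sat_pos = 1.0e6 * np.array([[  5.0, -20.0, 17.0],
                            [ -8.0, -15.0, 20.0],
                            [ 12.0, -22.0,  9.0],
                            [  2.0, -10.0, 24.0],
                            [ -3.0, -25.0,  8.0]])

receiver_true = np.array([1.2e6, -4.5e6, 4.3e6])   # assumed receiver position (m)
bias_true = 3.0e4                                   # assumed clock offset times c (m)

pseudoranges = np.linalg.norm(sat_pos - receiver_true, axis=1) + bias_true
est = locate_receiver(sat_pos, pseudoranges)
print("position error (m):", np.linalg.norm(est[:3] - receiver_true))
```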

The improvement to cm level or better precision is obtained 
by using the phase delays of the microwave carriers. Because 
the carriers have higher frequencies than the modulations, 


2 Acronyms abound in space geodesy, given its space and military origins. Alternative
meanings have been offered: the large VLBI project teams suggest “Very Large
Bunch of Investigators,” and the languid pace of GPS surveys prompted “Great Places
to Sleep.” There are also second-level acronyms involving other acronyms, such as
IGS for International GPS Service.

3 The Department of Defense can degrade GPS positioning via selective availability, 
which introduces errors in the satellite clocks. This capability, which was discontin¬ 
ued in May 2000, reduced the precision of single receiver positions but had little effect 
on precise geodetic positions. 
[Figure 4.5-1 panel titles: Constellation of GPS satellites (left); Relative positioning (right)]

Fig. 4.5-1 Left: The Global Positioning 
System (GPS) uses a constellation of 
satellites that transmit timing signals. 
Right: Using precise positions based on 
signals from multiple satellites recorded at 
multiple receivers, measurements over time 
yield relative velocities to precisions of a 
few mm/yr or better. 


their phase can yield more precise locations, much as higher-frequency
seismic waves can reveal more detailed velocity
structure (Section 3.2.3). The carrier wavelengths are 19 and 
24 cm, so precise phase measurements can resolve positions to 
a fraction of these wavelengths. The use of differential signals 
from multiple satellites recorded at multiple receivers reduces 
clock errors. Combining both transmitted frequencies removes 
the effects of the passage of the GPS radio signals through the 
ionosphere. Position errors due to signal delays from water 
vapor in the troposphere can be reduced by estimating the 
delays using an inversion process similar to solving for seismic 
velocity structure. 

The final element for high-precision surveys is provided by 
continuously operating global GPS tracking stations and data 
centers. These provide high-precision satellite orbit and clock 
information, earth rotation parameters, and a global reference 
frame. Using this information, GPS studies can achieve position
precisions better than 10 mm, so measurements over time yield relative
velocities to precisions of a few mm/yr or better, even for
sites thousands of kilometers apart. The uncertainty of the velocity
estimate depends on the precision of the estimated positions
and the time interval between them.

GPS data are collected in two modes. In survey mode, GPS 
antennas are set up over monuments for short periods, and the 
sites are reoccupied later. Alternatively, continuously recording
GPS receivers are installed permanently. Continuous GPS
can provide significantly more precise data, albeit at higher cost 
(in the USA, a 25-station network can presently be occupied in 
survey mode for the cost of a single continuous station). 

The biggest limitation of geodetic data for earthquake studies 
is that the positions of geodetic markers before the earthquake 
are needed. Thus effort and resources are required to install 
and survey monuments in advance, in hopes that an earthquake 
will occur nearby. In active seismic areas that are convenient 
for study, this condition can sometimes, but not always, be met.
A way around this difficulty is provided by Synthetic
Aperture Radar interferometry (InSAR) from satellites. 



Fig. 4.5-2 Left: Geometry of radar imaging from space. A physical
antenna’s angular resolution is $\theta_d = \lambda/d = x/r$, so x is the resolution on the
earth’s surface achievable by a radar with antenna length d and wavelength
$\lambda$ operating at altitude r. Synthetic aperture radar dramatically improves
the resolution. Right: Geometry of the InSAR method. The insert
illustrates the relation between the crustal motion $\mathbf{D}$ and the resulting
range change $\delta r = (\mathbf{D} \cdot \hat{\mathbf{r}})$. (After Bürgmann et al., 2000. Reproduced with
the permission of Annual Reviews, Inc.)

The synthetic aperture method allows high-resolution radar
mapping from spacecraft or aircraft. The resolution of a physical
radar can be estimated using the single slit diffraction concept
(Fig. 2.5-18), in which the angle $\theta_d$ between successive
zeros in the diffraction pattern is $\lambda/d$, where d is the slit width,
and $\lambda$ is the wavelength. For radar, d is the antenna length, so a
radar a distance r above the earth’s surface could resolve
objects of size x, where (Fig. 4.5-2, left)

$$\theta_d = \lambda/d = x/r. \qquad (1)$$

Because the radar wavelengths are tens of centimeters, a radar
antenna a few meters long orbiting hundreds of kilometers
above the earth can normally resolve topography only on a
scale of kilometers. However, SAR uses signal processing to
combine information collected by a moving satellite to simulate
an antenna much larger than the satellite’s real antenna.
For example, a real 10 m antenna can be used as a 4 km 
synthetic antenna. The synthetic antenna can thus resolve both 
topography and crustal deformation on a “footprint” of tens 
of meters. 
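
As a rough numerical illustration of Eqn 1, the following sketch evaluates x = λr/d for a real and a synthetic antenna. The 24 cm wavelength and 800 km altitude are assumed values, chosen only to fall within the ranges quoted above, and the function name is ours rather than part of any standard package.

```python
def ground_resolution(wavelength_m, antenna_length_m, range_m):
    """Size x of the smallest resolvable object, from Eqn 1: x = lambda * r / d."""
    return wavelength_m * range_m / antenna_length_m


# Assumed illustrative values (not taken from a specific satellite).
wavelength = 0.24   # radar wavelength in meters ("tens of centimeters")
altitude = 800e3    # distance r to the surface in meters ("hundreds of kilometers")

real_antenna = ground_resolution(wavelength, 10.0, altitude)         # 10 m real antenna
synthetic_antenna = ground_resolution(wavelength, 4000.0, altitude)  # 4 km synthetic antenna

print(f"real 10 m antenna:      x = {real_antenna / 1000:.1f} km")
print(f"synthetic 4 km antenna: x = {synthetic_antenna:.0f} m")
```

With these assumed values the real antenna resolves features of roughly 19 km, whereas the 4 km synthetic antenna resolves features of about 48 m, in line with the kilometer and tens-of-meters scales described in the text.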

Figure 4.5-2 (right) illustrates the technique. The phase
difference between radar signals with wavelength $\lambda$ reflected
from the earth’s surface and recorded by antennas at positions
$A_1$ and $A_2$ is

$$\phi = (4\pi/\lambda)(r_2 - r_1), \qquad (2)$$

where $r_i$ is the range from the antenna at $A_i$ to the reflection
point. The antenna baseline separation vector $\mathbf{B}$ and satellite
flight height H are known from the satellite orbits. Because
the baseline length $|\mathbf{B}|$ is much shorter than the ranges $r_i$, an
analysis like that used to derive the earthquake rupture time
(Fig. 4.3-2) shows that the elevation of the reflecting point is
$h = H - r_1 \cos\theta$, so topography can be mapped from space.
This method, called interferometry,⁴ is used for both earth and
planetary mapping, such as the Magellan mission to Venus.

Two such radar images can detect ground motion between 
successive measurements. If differences in satellite positions 
between the measurements are removed, a vector surface 
displacement D causes a phase change 

$$(4\pi/\lambda)\,\delta r, \qquad \delta r = (\mathbf{D} \cdot \hat{\mathbf{r}}), \qquad (3)$$

where $\delta r$ is the projection (scalar product, Section A.3.3) of
the vector displacement along $\hat{\mathbf{r}}$, the look direction connecting
the satellite and reflection point. To find the full displacement
vector, observations from ascending (moving north) and
descending (moving south) tracks of the satellite, or different 
satellites, can be combined. 

The results are shown as a phase difference map, called a 
differential interferogram. Figure 4.5-3 (top) shows such an 
image of the phase differences resulting from the 1992 Landers 
($M_w$ 7.3) and Big Bear ($M_w$ 6.2) earthquakes in the Mojave
desert of southern California. A range change $\delta r$ of $\lambda/2$ causes a
phase change of $2\pi$ that appears as one fringe (full shading
change) in the map. In this case, the C-band radar has a


4 Interferometry, using phase differences of traveling waves to make precise distance 
and time measurements, has many applications. In seismology, the time between 
arriving waves is measured by cross-correlation (Sections 3.3.6, 6.3.4). GPS and VLBI 
use the phase differences of radio waves to measure positions. Perhaps the most 
famous application of interferometry is the Michelson-Morley experiment in the 
1880s, which showed that the speed of light was the same in all directions despite the 
earth’s motion through space, and thus played a key role in the birth of the theory of 
relativity. 



Fig. 4.5-3 Top: SAR interferogram constructed from radar images taken 
on April 24, 1992, and June 18, 1993, showing the displacements
resulting from the 1992 Landers and Big Bear earthquakes. The shaded 
fringes are interference patterns obtained by comparing the images. Each 
cycle of shading represents 28 mm of change in the distance between the 
satellite and the ground, so the static displacement is on the order of tens 
of centimeters. Bottom: Synthetic interferogram computed using a model 
of the static displacements predicted by the focal mechanisms. The images
are 92.2 km across. (B. Hernandez, personal communication,
1999, based upon Hernandez et al., 1997. Geophys. Res. Lett., 24, 1579-82,
copyright by the American Geophysical Union.)
frequency of 5.2 GHz, so a fringe corresponds to 28 mm of
motion. The observed fringe pattern is coherent over large
areas where deformation is resolved. The pattern is reasonably
similar to a synthetic interferogram (Fig. 4.5-3, bottom)
generated for a detailed model of the Landers rupture, which 
involved several meters of right-lateral strike-slip on a complex 
set of NW-striking faults extending for about 85 km. 
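
The relation between a displacement vector, the range change, and the number of fringes (Eqn 3) can be sketched as follows. This is only an illustration: the 56 mm wavelength is chosen so that one fringe equals the 28 mm quoted above, the 23° look angle from vertical and the eastward-pointing look direction are assumed, and the sign convention (look vector pointing from the satellite toward the ground, so that positive range change means motion away from the satellite) is our own choice.

```python
import math


def range_change(displacement, look):
    """Project displacement D onto the unit look vector: delta_r = D . r_hat."""
    norm = math.sqrt(sum(c * c for c in look))
    return sum(d * c / norm for d, c in zip(displacement, look))


def phase_change(delta_r, wavelength):
    """Phase change (4 pi / lambda) * delta_r from Eqn 3, in radians."""
    return 4.0 * math.pi * delta_r / wavelength


def fringe_count(delta_r, wavelength):
    """Number of fringes: one fringe per 2 pi of phase, i.e. per lambda / 2 of range."""
    return phase_change(delta_r, wavelength) / (2.0 * math.pi)


wavelength = 0.056                    # meters, so lambda / 2 = 28 mm (assumed)
incidence = math.radians(23.0)        # assumed look angle from vertical
# Components are (east, north, up); the look vector points from satellite to ground.
look = (math.sin(incidence), 0.0, -math.cos(incidence))

# Hypothetical displacements of 10 cm, purely horizontal and purely vertical.
for label, d in (("10 cm east", (0.10, 0.0, 0.0)), ("10 cm down", (0.0, 0.0, -0.10))):
    dr = range_change(d, look)
    print(f"{label}: range change {dr * 1000:.1f} mm, {fringe_count(dr, wavelength):.2f} fringes")
```

For this steep assumed look angle the vertical displacement produces a larger range change than the equal horizontal one, which illustrates why the method is especially sensitive to vertical motion.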

InSAR has several attractive features for earthquake studies. 
Although radar images before an earthquake are needed, 
satellites can acquire them over areas far too large for geodetic 
monuments to have been installed everywhere. In addition, 
InSAR maps deformation on a spacing of tens of meters, far 
denser than is practical with geodetic monuments. Moreover, 
InSAR is especially sensitive to vertical motions, the component
for which GPS is the least precise. InSAR also has several
limitations. It recovers motion only in the look direction. It
cannot be used in some areas of steep topography, where the 
radar beam cannot penetrate, or where the slope facing the 
radar is so steep that several points have the same range to 
the radar. Another limitation is that nontectonic changes 
between images, such as those due to vegetation growth or 
weather conditions (which affect radio wave propagation in 
the atmosphere), can mask the effects of crustal motion. However,
when such decorrelation between successive images is not
a problem, as in deserts or other bare rock settings, InSAR is a 
powerful tool. Finally, InSAR provides relative changes within 
an image that is tens to a hundred kilometers across, but does 
not provide absolute positions on a plate-wide or global scale. 
This poses no problems for individual earthquake studies, but 
means that it alone cannot be used for large-scale applications 
like plate boundary studies. In many applications, InSAR and 
GPS are both being combined with seismological data. These 
techniques are also being applied together with seismology to 
study ground deformation at volcanoes. 

The advent of space-based methods like GPS and InSAR,
which make collecting geodetic data faster and easier, has
made earthquake geodesy and seismic wave studies common
overlapping approaches to earthquake studies. Hence, although
seismology and earthquake geodesy were long viewed as very
distinct, owing to their different instrumentation, earthquake
geodesy is increasingly viewed as very low-frequency seismology
(or earthquake seismology as high-frequency geodesy).

4.5.2 Coseismic deformation 

Seismic source theory shows that the static coseismic displacements
produced by earthquakes have radiation patterns
analogous to the propagating wave displacements shown
in Figs 4.2-6 and 4.2-7, and so can also provide important
information about the fault geometry and slip. An important
feature of these displacements is that they contain $1/r^2$ terms,
compared to $1/r$ terms for the propagating waves (Eqns 1
and 2). Thus, compared to the propagating waves, the static
displacements decay more rapidly with distance from the
earthquake. Hence we typically describe the static displacements
using Cartesian coordinates near a fault, rather than the
spherical coordinates used for teleseismic waves.

Fig. 4.5-4 Top: Horizontal static displacements following the 1927
Tango, Japan, earthquake. The dashed line shows the fault trace.
Bottom: Decay of fault-parallel displacements with distance
perpendicular to the fault. (After Chinnery, 1961. © Seismological
Society of America. All rights reserved.)

A classic example, shown in Fig. 4.5-4, is that of the static
displacements following the 1927 $M_s$ 7.5 Tango, Japan, earthquake.
The displacements change direction across the fault
trace, showing that the earthquake involved primarily left-lateral
strike slip. The fault-parallel displacement component
decays rapidly with distance from the fault. 

Fig. 4.5-5 Top: Geometry of the vertically dipping strike-slip fault model.
L and W are fault length and width. Bottom: Predicted fault-parallel static
displacements, normalized to the maximum offset, for an infinite strike-slip
fault, for different fault widths.

Although the full expressions for the static displacements
due to slip on a fault are complicated, we can gain considerable
insight from the simple case of pure strike-slip faulting on an
infinitely long vertically dipping fault. In this case (Fig. 4.5-5,
top), the fault-parallel displacement in the x direction, u(y),
varies with the distance from the fault y as

$$u(y) = \pm D/2 - (D/\pi) \tan^{-1}(y/W), \qquad (4)$$

where D is the slip across the fault, and W is the depth to which
faulting extends, called the fault width. The $\pm D/2$ term is positive
for y > 0, negative for y < 0. This model assumes that the slip is
uniform all over the fault plane. Figure 4.5-5 (bottom) shows
this solution for several different fault widths. Near the fault,
$y \to 0$, so the inverse tangent is zero, and $u(0) = \pm D/2$. The displacement
decays away from the fault, so by a distance equal
to the fault width ($y/W = 1$) the inverse tangent is $\pi/4$ and the
displacement is D/4, or half that at the fault. Far from the fault,
$y/W \to \infty$, the inverse tangent approaches $\pi/2$, and the displacement
dies off. Hence the distance
over which the displacement extends gives information about
the fault width. For example, the data in Fig. 4.5-4 indicate a
fault width of about 10 km.
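
The limits just described are easy to check numerically. The sketch below evaluates Eqn 4 for an assumed slip of 2 m and fault width of 10 km; these values and the function name are illustrative only, and the point y = 0 is arbitrarily assigned to the positive side of the fault.

```python
import math


def coseismic_displacement(y, slip, width):
    """Fault-parallel static displacement u(y) of Eqn 4 for an infinite strike-slip fault.

    y and width share the same length units; the result has the units of slip.
    The point y = 0 is treated as lying on the y > 0 side of the fault.
    """
    sign = 1.0 if y >= 0 else -1.0
    return sign * slip / 2.0 - (slip / math.pi) * math.atan(y / width)


D = 2.0    # assumed slip in meters
W = 10.0   # assumed fault width in km

for y in (0.0, 5.0, 10.0, 20.0, 50.0, 200.0):
    print(f"y = {y:6.1f} km   u = {coseismic_displacement(y, D, W):.3f} m")
```

The displacement is D/2 at the fault, D/4 at y = W, and approaches zero far from the fault, with the opposite sign on the other side.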

For this infinite fault, fault-parallel displacement extends to 
infinity along the fault. Calculations for finite-length faults 





Fig. 4.5-6 Top: Fault-parallel static displacements for a finite vertically
dipping strike-slip fault. Contours are labeled in units of $10^{-3}$ times the
maximum offset. (Chinnery, 1961. © Seismological Society of America.
All rights reserved.) Center: Predicted fault-parallel static displacements, 
normalized to the maximum offset, for strike-slip faults with different 
fault widths (W) and lengths (L). The horizontal bar is where displacement 
has dropped to half its value at the fault. Bottom: Predicted fault-parallel 
static displacements for three buried infinite strike-slip faults extending 
from depth w to depth W, all with the same slip. (Mavko, 1981. 
Reproduced with the permission of Annual Reviews Inc.) 

show that the displacement tapers off rapidly past the fault 
ends (Fig. 4.5-6, top). In addition, there is some fault-normal 
(y direction) motion. For finite faults (Fig. 4.5-6, center), the
decay of fault-parallel displacement perpendicular to the
fault (in the y direction) depends somewhat on the ratio of
the fault width to fault length, W/L. Thus the fault width estimated
from the decay depends on the assumed length.

Fig. 4.5-7 Vertical component of static displacement as a function of distance from various pure dip-slip faults. (Yeats et al., 1997; after Stein and Yeats,
1989. Courtesy of H. Iken.)


If a fault is buried and extends from depth w to depth W, 
Eqn 4 becomes 

$$u(y) = (D/\pi)[\tan^{-1}(y/w) - \tan^{-1}(y/W)]. \qquad (5)$$

In this case, the maximum surface displacement is less than
half the fault slip and occurs a distance from the fault equal
to the geometric mean depth $(wW)^{1/2}$ (Fig. 4.5-6, bottom). Thus the
displacement fields of buried faults are smoother and lower- 
amplitude versions of those for faults that reach the surface. 
These differences occur because a buried fault is further away 
from each point on the surface, and the higher spatial frequencies
(shorter wavelengths) in the displacement decay faster with
distance, making the displacement smoother. As a result, there 
is a trade-off between the fault’s down-dip dimension W - w 
and the coseismic slip D , and one is often assumed to determine 
the other. Often, fault dimensions are estimated from the aftershock
zone.

The buried fault solution (Eqn 5) is derived by simply adding
to Eqn 4 a fictitious second fault extending from the surface
to the fault top w, with the same slip but in the opposite direction.
This is an example of the general principle that we can
superimpose static solutions for simple geometries to obtain 
the solution for a complicated geometry. We also do this for 
the propagating waves from complex faults, as we will see 
shortly. The solutions can be added because they satisfy linear 
elasticity. 
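
A short numerical check of this superposition, and of the location of the maximum displacement for a buried fault, might look like the following. The slip and depths are assumed values, the function names are ours, and the maximum is found by a simple grid search rather than analytically.

```python
import math


def surface_fault(y, slip, width):
    """Eqn 4: fault reaching the surface and extending to depth `width`."""
    sign = 1.0 if y >= 0 else -1.0
    return sign * slip / 2.0 - (slip / math.pi) * math.atan(y / width)


def buried_fault(y, slip, top, bottom):
    """Eqn 5: fault extending from depth `top` (w) to depth `bottom` (W)."""
    return (slip / math.pi) * (math.atan(y / top) - math.atan(y / bottom))


D, w, W = 1.0, 2.0, 8.0   # assumed slip (m) and top and bottom depths (km)

# Superposition: the fault from 0 to W plus an opposite-slip fault from 0 to w.
for y in (1.0, 4.0, 20.0):
    combined = surface_fault(y, D, W) + surface_fault(y, -D, w)
    assert abs(combined - buried_fault(y, D, w, W)) < 1e-12

# Grid search for the distance at which the surface displacement peaks.
ys = [0.01 * i for i in range(1, 3001)]
y_peak = max(ys, key=lambda y: buried_fault(y, D, w, W))
print(f"peak at y = {y_peak:.2f} km, sqrt(w W) = {math.sqrt(w * W):.2f} km")
print(f"peak displacement = {buried_fault(y_peak, D, w, W):.3f} D")
```

For these assumed depths the peak lies at the geometric mean depth of 4 km and is about 0.2 of the slip, well below the D/2 reached at a fault that breaks the surface.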

Solutions are also available for dip-slip faults. Figure 4.5-7 
shows solutions for the vertical component of static displacement
as a function of distance from various pure dip-slip faults.
For vertical dip, the solution looks like the strike-slip solution 
turned vertically. If the dip is not vertical, the displacement 
varies in magnitude as well as sign across the fault. The higher 
amplitudes are above the thrust fault, on the hanging wall 
block. Interestingly, seismic wave amplitudes for this geometry 
are also often highest on the hanging wall, and can cause significant
damage when such earthquakes occur under populated
areas. For a fault that does not reach the surface, the displacement
is both reduced in amplitude and varies more smoothly
with distance than it would for a fault extending to the surface. 
Such buried dip-slip faults are sometimes called “blind” faults, 
because they do not appear at the surface and may not be 
recognized until an earthquake occurs. 

The general solutions allow modeling of all three components
of static displacement for earthquakes with any focal
mechanism and finite fault dimensions. We can also model 
situations in which different parts of the fault slip by different 
amounts. 

Estimating fault parameters from geodetic data is a classic 
example of an inverse problem with a highly non-unique 
solution, because various combinations of fault parameters 
predict similar deformation. Figure 4.5-8 shows six solutions 
that all give reasonable fits to the Tango earthquake data 
(Fig. 4.5-4). Model I is an infinite fault with uniform slip at 
depth, model II is an infinite fault with slip tapering to zero at 
depth, and models III and IV are finite faults with uniform and 
variable slip, respectively. Model V is the most complicated, 
in that it assumes that the material near the fault is weaker 
than that further away. 

4.5.3 Joint geodetic and seismological earthquake studies

Combining geodetic and seismic wave observations gives more 
information than either data type alone. The two data types are 
nicely complementary. For example, although seismic waves 
have an ambiguity in distinguishing between the fault plane 
and the auxiliary plane, the geodetic data do not, as shown by 
the fact that the Tango earthquake data (Fig. 4.5-4) and static 
displacement models (Fig. 4.5-6, top) do not have a nodal plane 
perpendicular to the fault plane. Both data types can give good 
constraints on the fault geometry and slip on it, and aftershock 
locations often provide the best constraint on fault dimensions. 
However, geodetic data that depend on the difference in position
before and after an earthquake provide no information

Fig. 4.5-8 Comparison of different fault models that predict coseismic
deformation similar to that observed for the Tango earthquake (Fig. 4.5-4).
Distance is perpendicular to the fault. The data are normalized by the
fault offset, and points from the SW side (closed dots) are multiplied by -1 
and plotted with points from the NE side (open dots). (Mavko, 1981. 
Reproduced with the permission of Annual Reviews, Inc.) 

about what happened during the earthquake, whereas seismological
data can sometimes show how the rupture evolved.

Figure 4.5-9 illustrates an example of combining geodetic
and seismological data for the 1994 $M_s$ 6.7 Northridge earthquake,
which occurred on a buried thrust fault in the San
Fernando Valley, near Los Angeles.⁵ The focal mechanism
and aftershock distribution indicate thrust faulting on a 
NW-striking, SW-dipping fault. The geodetic (GPS) data show 
significant vertical and horizontal motions concentrated above 
the buried fault. The directions and magnitudes of the static 
deformation, including the motion of down-dip sites toward 
the fault and the high amplitudes above the fault, are what we 
would expect for this geometry (Fig. 4.5-7). These data can 
be modeled quite well by assuming that about 2.5 m of slip 



[Figure 4.5-9 axis labels: Longitude; Distance (km)]


Fig. 4.5-9 Geodetic and seismological results for the 1994 Northridge 
earthquake. Top : The horizontal (solid arrows) and vertical (solid bars) 
motions observed by GPS are well matched (dashed arrows and open bars) 
by a fault model derived from these data. Negative uplift is shown by bars 
below the station locations (dots). Bottom : Aftershock locations (dots) 
and geometry of fault models with uniform slip (thick line) and variable 
slip on a longer fault (thin line), both of which fit the data. (After
Hudnut et al., 1996; Thio and Kanamori, 1996; and Wald et al., 1996.
© Seismological Society of America. All rights reserved.)


5 This earthquake, which is one of the most studied owing to the extensive seismological
and geodetic networks in the area, gave rise to some of the highest ground
accelerations ever recorded. It illustrates that even a moderate magnitude earthquake 
can do considerable damage in a populated area. Although the loss of life (58 deaths) 
was small due to earthquake-resistant construction (Section 1.2.2), the 20 billion dollars
in damage makes it the most costly earthquake to date in the USA.


occurred on a fault plane similar to that which one would infer 
from the aftershocks. Two geodetic solutions are shown, one 
with uniform slip and one with variable slip on a larger fault. 

Because high-quality geodetic and seismological data are 
available, considerable detail about the slip distribution has 
been inferred. Strong motion data from seismometers close to 

Fig. 4.5-10 Comparison of results of slip
inversions for the Northridge earthquake
using various datasets. The fault plane is
viewed from the southwest and above. The
epicenter is marked by a star. (Wald et al.,
1996. © Seismological Society of America.
All rights reserved.)
[Figure 4.5-10 axis label: Distance along strike (km)]


the earthquake are especially valuable because they contain 
high-frequency details about the source time function, and 
thus slip process, which can be lost in teleseismic data due to 
attenuation (Fig. 4.3-10). Figure 4.5-10 shows maps of the slip 
distribution on the fault plane estimated first by inverting the 
strong motion, teleseismic, and geodetic data separately, and 
then by a joint inversion. The seismic inversions extend analysis
like that shown in Fig. 4.3-11, which resolved the source
time function into sub-events, to locate sub-events on the fault 
plane. Interestingly, the largest slip is not at the epicenter (star). 
The results for the different data types differ because each is 
sensitive to different features of the slip. For example, the geodetic
data yield a much smoother image than the seismic data,
which can resolve the rupture process, whereas the GPS data
sample only its end result. Thus, both waveform datasets yield
a high-slip region near the fault’s northwest corner. Figure 4.5-11
shows the time evolution of the rupture inferred from the
waveforms. Rupture began at the epicenter and then propagated
up-dip and northwestward. Such models are giving
our best look to date into the rupture process, and are being 
combined with experimental and theoretical studies of rock 
fracture (Section 5.7) to explore the complex physics of earthquake
faulting.

Geodetic data after earthquakes also sometimes show a 
phenomenon called afterslip or postseismic slip, in which 
deformation goes on “silently” (without a seismic signal) for 
some time after an earthquake and its seismologically observed 
aftershocks. For plate boundaries, this motion is sometimes 
thought of as a postseismic portion of the seismic cycle, during 
which the motion slows from the rapid coseismic motion to 
the slower steady interseismic motion. However, as discussed in 
Section 5.7.6, it is often unclear whether the postseismic motion 
reflects continued slip on the earthquake fault, the response 
of the lithosphere to the earthquake having a time-varying 

Fig. 4.5-11 Time history results for the Northridge earthquake. Rupture
appears to have begun at the epicenter (star) and then propagated up-dip
and northwestward. The geometry is the same as in Fig. 4.5-10. (Wald
et al., 1996. © Seismological Society of America. All rights reserved.)

viscous component in addition to purely elastic instantaneous 
deformation, or both. 

4.5.4 Interseismic deformation and the seismic cycle 

Geodesy gives insight into the seismic cycle before, after, and 
between earthquakes, whereas we can only study the seismic 


[Figure 4.5-12 panel title: Strike-slip fault; axis label: Distance (km)]

Fig. 4.5-12 Top: Coseismic (heavy solid line), interseismic (dashed line),
and total or far-field (thin solid line) motions in the fault-parallel (x)
direction as functions of fault-perpendicular distance (y) for an elastic
rebound model of the seismic cycle on an infinite, vertically dipping,
strike-slip fault. Bottom: Interseismic strain for this model.

waves once an earthquake occurs. To see this, consider a simple 
elastic rebound (Fig. 4.1-3) model of an infinite strike-slip fault 
at a plate boundary, assuming that large earthquakes release 
all the strain which accumulates between earthquakes. After an 
earthquake, material on the right (+y) side far from the fault
moves at the far-field rate v relative to the left (-y) side of the
fault, and so has moved a distance vt by time t (Fig. 4.5-12,
top). However, between earthquakes the fault is locked down 
to depth W, although it slips freely below, so material at the 
fault does not move between earthquakes. When the next large 
earthquake occurs, completing the seismic cycle, everything 
to the right of the fault must have moved a distance vt. The 
earthquake’s coseismic displacement will be given by Eqn 4 
with D = vt, so the coseismic slip u(y) is less than D except at the 
fault. This means that points away from the fault already have 
moved part of the distance D before the earthquake. Similarly, 
everything on the left side must have had no net motion from 
the seismic cycle, even though material near the fault moved 
“backward” (in the -x direction) during the earthquake. 

Thus the fault-parallel interseismic motion s(y) is found by
subtracting the coseismic slip from the far-field (or net) motion,
giving

$$s(y) = D/2 + (D/\pi) \tan^{-1}(y/W). \qquad (6)$$

Hence, as shown in Fig. 4.5-12 (top), material on the left side
near the locked fault is “dragged along” during the interseismic
period, and then rebounds during the earthquake. Material on
the right side near the fault is retarded during the interseismic
period, and then “catches up” to the far-field motion due to the
coseismic deformation. Equations 4 and 6 are thus mathematical
formulations of the elastic rebound model in Fig. 4.1-3.
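
The bookkeeping behind Eqns 4 and 6 can be verified with a few lines of code. In this sketch the slip accumulated over one cycle (D = vt = 4 m) and the locking depth (15 km) are assumed values, and the function names are ours.

```python
import math


def coseismic(y, slip, depth):
    """Eqn 4: coseismic fault-parallel displacement u(y), for y != 0."""
    sign = 1.0 if y > 0 else -1.0
    return sign * slip / 2.0 - (slip / math.pi) * math.atan(y / depth)


def interseismic(y, slip, depth):
    """Eqn 6: fault-parallel motion s(y) accumulated between earthquakes."""
    return slip / 2.0 + (slip / math.pi) * math.atan(y / depth)


def far_field(y, slip):
    """Net motion over a full cycle relative to the far left (-y) side."""
    return slip if y > 0 else 0.0


D = 4.0    # assumed slip per cycle in meters (D = v t)
W = 15.0   # assumed locking depth in km

for y in (-100.0, -15.0, -1.0, 1.0, 15.0, 100.0):
    total = coseismic(y, D, W) + interseismic(y, D, W)
    assert abs(total - far_field(y, D)) < 1e-12
    print(f"y = {y:7.1f} km   s = {interseismic(y, D, W):.2f} m   "
          f"u = {coseismic(y, D, W):+.2f} m   total = {total:.2f} m")
```

At every distance the interseismic and coseismic motions sum to the far-field motion: D on the right side and zero on the left.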

If the fault is a plate boundary, the interseismic deformation 
occurs over a finite plate boundary zone within which sites on 
either side of the boundary move relative to the interior of the 
plate they are on. In this case, the boundary zone is relatively 
narrow, comparable to the depth to which the fault is locked. 
However, as we will see, many plate boundary zones are 
broader because additional faults take up some of the plate 
motion. 

Because the interseismic motion is the difference between the
far-field motion and coseismic deformation, its variation with
distance from the fault depends on the locking depth and far-field
rate. Comparison with the coseismic slip shows that the
width of the zone across which the motion changes rapidly
depends on the locking depth. Shallow locking concentrates
interseismic slip near the fault, whereas deeper locking spreads
it out into a broad shear zone. Hence a series of geodetic
surveys can develop a velocity profile across the fault, which
we can interpret by setting D = vt in Eqn 6 and dividing the
change in positions between surveys by the time between them.
Figure 4.5-13 shows a profile across the much-photographed
(Fig. 4.1-1) Carrizo Plain segment of the San Andreas fault.
The data are reasonably well fit by a far-field rate of about
35 mm/yr. As we will discuss in the next chapter, this rate is
less than the total (approximately 45 mm/yr) motion between
the Pacific and North American plates, showing that some of
the plate motion occurs away from the San Andreas fault over a
broader plate boundary zone. In fact, we will see that space
geodetic profiles across the broad boundary zone, which contains
many faults, look generally like Fig. 4.5-12 (top) but with
the full relative plate velocity.

Fig. 4.5-13 GPS data showing fault-parallel horizontal interseismic
motion across the Carrizo Plain segment of the San Andreas fault.
(Z.-K. Shen, personal communication, 2000.)

We can use Eqn 6 to find the interseismic shear strain rate

$$\dot{e}_{xy} = \frac{1}{2} \frac{\partial \dot{s}(y)}{\partial y} = \frac{v}{2\pi W} \frac{1}{[1 + (y/W)^2]}. \qquad (7)$$

As shown in Fig. 4.5-12 (bottom), strain accumulates near the 
fault during the interseismic period and is released in large 
earthquakes. Like the displacement, the variation of strain with
distance from the fault depends on the locking depth and
far-field rate. The strain rate can be inferred from changes in
the angles between geodetic markers. Thus, prior to the advent 
of GPS, which made studying displacements much easier, many 
fault geodesy studies used triangulation to study interseismic 
strain accumulation rates. 
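
To make the connection between the velocity profile and Eqn 7 concrete, the sketch below computes the interseismic velocity (Eqn 6 with D = vt, divided by t) and compares a numerical derivative with the strain rate formula. The 35 mm/yr far-field rate is the value quoted above for the Carrizo Plain; the 15 km locking depth is an assumed value for illustration, and the function names are ours.

```python
import math


def interseismic_velocity(y_km, rate, locking_depth_km):
    """Fault-parallel velocity from Eqn 6 with D = v t; same units as `rate`."""
    return rate / 2.0 + (rate / math.pi) * math.atan(y_km / locking_depth_km)


def shear_strain_rate(y_km, rate, locking_depth_km):
    """Eqn 7. With rate in mm/yr and distances in km, the result is in microstrain/yr."""
    w = locking_depth_km
    return rate / (2.0 * math.pi * w) / (1.0 + (y_km / w) ** 2)


v = 35.0   # far-field rate in mm/yr (from the text)
W = 15.0   # assumed locking depth in km

h = 1e-3   # step in km for the central difference
for y in (0.0, 15.0, 50.0):
    numeric = 0.5 * (interseismic_velocity(y + h, v, W)
                     - interseismic_velocity(y - h, v, W)) / (2.0 * h)
    print(f"y = {y:5.1f} km   velocity = {interseismic_velocity(y, v, W):5.1f} mm/yr   "
          f"strain rate = {shear_strain_rate(y, v, W):.3f}   numerical = {numeric:.3f}")
```

For these values the strain rate at the fault is about 0.37 microstrain per year and falls to half that at a distance equal to the locking depth.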

Although this example is shown for a strike-slip fault (the 
easiest to draw), a similar approach is used for thrust faults at 
subduction zones (Fig. 4.5-14). The interseismic motion is 
modeled as the difference between long-term plate motion and 
the coseismic deformation in large plate boundary earthquakes 
(e.g., Fig. 4.5-7). As for the strike-slip case, interseismic motion 

Fig. 4.5-14 Top: Two stages in the earthquake cycle at a subduction zone.
Bottom: Predicted interseismic vertical motion due to a locked fault at a
subduction zone. The vertical motion is normalized by the locked plate
convergence rate, and the horizontal distance is normalized by the distance
between the trench and end of the locked fault. (Savage, 1983. J. Geophys.
Res., 88, 4984-96, copyright by the American Geophysical Union.)

Fig. 4.5-15 GPS velocities relative to North
America for some sites near the rupture zone
of the great 1964 Alaska earthquake. The
eastern sites move in the plate convergence
direction, as expected for interseismic
motion, whereas sites to the west move in
the opposite direction, implying postseismic
motion. (Freymueller et al., 2000. J.
Geophys. Res., 105, 8079-101, copyright
by the American Geophysical Union.)

occurs in a boundary zone extending some distance from the
fault defining the nominal plate boundary. Modeling predicts
interseismic subsidence and landward motion for most sites
above the locked fault, and uplift further inland (Fig. 4.5-14,
bottom). The motion has largely decayed by a distance equal to
twice that between the trench and the locked fault end. 

Thus geodetic data near trenches can identify the interseismic
deformation and provide insight into the mechanics of
the subduction interface and future large earthquakes on it.
Figure 4.5-15 shows GPS velocities relative to the stable interior
of North America for some sites near the rupture zone of
the great 1964 Alaska earthquake (Fig. 4.3-15). Sites to the east 
of the area shown move northwest, in the direction of Pacific 
plate subduction beneath North America, as we would expect 
for the interseismic motion of sites on the overriding plate 
above a locked fault. The motion decays rapidly landward 
Fig. 4.5-16 Profiles of horizontal (top) and 
vertical (center) GPS velocities relative to 
North America for eastern sites in Fig. 4.5-15, 
plotted against distance from the trench (km). 
The data are reasonably similar to 
predictions (solid line) for a locked fault 
model (bottom). Note that uncertainties for 
the vertical GPS data are larger than for the 
horizontal data. (Freymueller et al., 2000. J. 
Geophys. Res., 105, 8079-101, copyright 
by the American Geophysical Union.) 



with distance from the trench. These observations, together 
with the observed uplift, are reasonably consistent with the 
expected interseismic motion (Fig. 4.5-16). However, sites to 
the west move in the opposite direction, toward the trench, and 
so appear instead to show continuing postseismic motion. The 
differences between the two regions may reflect the complex 
slip history in the great earthquake or long-term differences in 
the behavior of different parts of the plate interface. 

Hence, in general, geodetic data from the interseismic period 
give insight into the mechanics of a fault and future earthquakes 
on it, even before they occur. This is gratifying because 
the seismic cycle is so long, typically hundreds of years, that we 
generally have to wait a long time to study a major earthquake 
on a given fault segment. A slight compensation is that, as 
we wait, estimates of geodetic velocities improve. Consider 
measuring the rate $v$ of motion of a monument that started at 
position $x_1$ and reaches $x_2$ in time $T$. If the position uncertainty 
is given by its standard deviation $\sigma$, then the propagation of 
errors relation (Eqn 6.5.18) discussed in Chapter 6 shows that 

$$v = (x_2 - x_1)/T \quad \text{implies} \quad \sigma_v = \sqrt{2}\,\sigma/T, \qquad (8)$$

where $\sigma_v$ is the uncertainty of the inferred rate. Thus the longer 
we wait, the smaller the velocity uncertainty becomes, even if 
the data do not become more precise. Hence, older geodetic 
data — for example, those taken shortly after the 1906 San 
Francisco earthquake — can be of great value even if their 
errors are larger than those of more modern data. 
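
The behavior of Eqn 8 can be illustrated numerically. The following is a minimal sketch, assuming two position measurements with the same standard deviation; the function name and the 5 mm position uncertainty are hypothetical values chosen only for illustration.

```python
import math

def rate_uncertainty(sigma_pos, T):
    """Uncertainty of a rate from two positions measured a time T apart.

    Implements sigma_v = sqrt(2) * sigma / T (Eqn 8), assuming both
    positions have the same standard deviation sigma_pos.
    """
    return math.sqrt(2.0) * sigma_pos / T

# Hypothetical position uncertainty of 5 mm (an assumption, not from the text).
sigma_mm = 5.0
for years in (1, 5, 10, 50):
    print(f"T = {years:3d} yr  sigma_v = {rate_uncertainty(sigma_mm, years):.2f} mm/yr")
```

The rate uncertainty falls in inverse proportion to the time spanned, which is why a long baseline can compensate for less precise early measurements.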

The geodetic data let us see the rate at which locked slip is 
accumulating, and hence infer the maximum possible slip in a 
future earthquake, depending on when it occurs. Conversely, 
we can estimate the time until a future earthquake from records 
of past earthquakes, by assuming what the coseismic slip will 
be. However, as we noted in Section 1.2 and will discuss further, 
large earthquakes are variable enough that attempts to 
predict them by approaches like this have not been successful. 

In some places, geodetic data imply that slip is accumulating 
on the locked fault at a rate less than the far-field motion. 
For the San Andreas example shown, this difference seems to 
be due to plate motion taken up elsewhere. In other places, 
the difference is thought to indicate that some of the plate 
boundary slip occurs by aseismic slip or sliding (perhaps as 
“silent earthquakes”) on the fault, and hence will not appear in 
future earthquakes. As discussed in the next chapter, the idea 
that significant portions of the motion on many plate boundaries 
occur aseismically is also suggested by earthquake history 
studies. Such aseismic fault creep has been observed geodetically 
in some areas. 
The area of locking has interesting implications. Because 
an earthquake’s seismic moment is the product of the fault 
area, coseismic slip, and rigidity, the fault width and rate at 
which slip accumulates give insight into the maximum seismic 
moment that the locked fault could release in a future earthquake. 
The San Andreas data (Fig. 4.5-13) indicate that the 
vertically dipping fault is locked to a depth of about 20 km, 
which is similar to the maximum depth of small earthquakes 
and the inferred lower extent of rupture in large earthquakes 
along the fault. As discussed in Section 5.7, this depth is generally 
consistent with studies of rock strength and friction, 
which imply that rocks deeper than about 20 km are weak 
and undergo stable sliding rather than accumulate elastic strain 
for future earthquakes. The Alaska situation is quite different 
because the plate interface has a shallow dip (Fig. 4.5-16), 
so there is a large fault area at depths shallow enough to 
accumulate strain and then rupture. Hence, as we will see in 
Section 4.6, the largest earthquakes occur at shallow-dipping 
subduction zones and are much bigger than those for transform 
boundaries. In either environment, however, it is not clear 
whether the entire locked region contributes to the seismic slip 
or whether part of the fault slips rapidly in the earthquake 
and another part contributes to aseismic afterslip. 
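
The link between locked area, slip accumulation, and potential moment can be sketched as follows. This is only an illustration under stated assumptions: the function name, the 100 km length, the 30 mm/yr locked slip rate, the 100 yr interval, and the rigidity of 3 x 10^11 dyn/cm^2 are hypothetical inputs, and the whole locked area is assumed to release all of its accumulated slip seismically, which the text notes is not established.

```python
def accumulated_moment(length_km, width_km, slip_rate_mm_yr, years,
                       rigidity=3e11):
    """Upper-bound seismic moment (dyn-cm) stored on a fully locked fault.

    Moment = rigidity * fault area * accumulated slip, with the area in
    cm^2, the slip in cm, and the rigidity in dyn/cm^2 (assumed value).
    """
    area_cm2 = (length_km * 1e5) * (width_km * 1e5)   # km -> cm
    slip_cm = slip_rate_mm_yr * 0.1 * years           # mm -> cm
    return rigidity * area_cm2 * slip_cm

# Hypothetical vertical strike-slip segment locked to 20 km depth.
m0 = accumulated_moment(length_km=100, width_km=20,
                        slip_rate_mm_yr=30, years=100)
print(f"Stored moment after 100 yr: {m0:.2e} dyn-cm")
```

Increasing the locked width from tens to hundreds of kilometers, as on a shallow-dipping subduction interface, raises the stored moment in direct proportion.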

To complicate matters even further, it is worth bearing in 
mind that we still do not have good geodetic data spanning 
even one full seismic cycle, much less such data combined with 
detailed studies of earthquakes at either end. Hence we have 
little insight into the different possible time-variable effects like 
afterslip or the transient effects due to earthquakes on nearby 
faults or other segments of the same fault. Thus it may be quite 
some time before many of these issues are resolved. 

An intuitive way to summarize some of these ideas is to think 
of the seismic cycle as a fault’s “slip budget,” analogous to 
personal finances. Given our income (plate motion), we spend 
some immediately (aseismic slip) and save some (locked slip). 
The savings are used for major purchases (earthquakes) at a 
rate depending on the price of individual purchases (coseismic 
slip), expenses associated with these major purchases 
(postseismic slip), and our saving rate (locked slip). Thus, although 
we can estimate roughly when we might make a future large 
purchase, the actual date depends on unpredictable changes in 
the price (variable earthquake size) and changes in our savings 
beyond our steady income and regular expenses, due to gifts or 
unanticipated expenses (effects of other earthquakes). Thus 
even in this simple analogy the earthquake cycle is complicated. 

4.6 Source parameters 

4.6.1 Magnitudes and moment 

So far in this chapter we have discussed using seismic waves 
radiated by earthquakes to study their source geometry and 
focal depth. While recognizing the limitations on what the 
seismic waves can tell us about the actual source process, we 
have seen that for most earthquakes, assuming a simple fault 
geometry and source model allows us to estimate parameters 
that are generally consistent with other data and our geological 
instincts. We thus proceed further in using seismic waves to 
learn more about the faulting process. 

In fact, even before earthquake mechanisms were studied, 
seismologists’ second need after learning to locate earthquakes 
was to quantify their size, both for scientific purposes and to 
discuss their effects on society. The first measure introduced 
was the magnitude, which is based on the amplitude of the 
resulting waves recorded on a seismogram. The concept is 
that the wave amplitude reflects the earthquake size once the 
amplitudes are corrected for the decrease with distance due to 
geometric spreading and attenuation. Magnitude scales thus 
have the general form 

$$M = \log(A/T) + F(h, \Delta) + C, \qquad (1)$$

where $A$ is the amplitude of the signal, $T$ is its dominant 
period, $F$ is a correction for the variation of amplitude with the 
earthquake’s depth $h$ and distance $\Delta$ from the seismometer, and 
$C$ is a regional scale factor.¹ Magnitude scales are thus logarithmic, 
so an increase in one unit, as from magnitude “5” to 
“6,” indicates a ten-fold increase in seismic wave amplitude. 
Measured magnitudes range more than 10 units² because the 
displacements measured by seismometers span more than a 
factor of $10^{10}$. 

The earliest magnitude scale, introduced by Charles Richter 
in 1935 for southern California earthquakes, is the local 
magnitude, $M_L$, often referred to as the “Richter scale.” Figure 4.6-1 
shows how $M_L$ is determined from the amplitude measured on 
a specific seismograph, known as the Wood-Anderson seismograph. 
The amplitude of the largest arrival (often the S wave) 
is measured and corrected for the distance between the source 
and the receiver, given by the difference in the arrival times of 
the P and S waves. The scale 

$$M_L = \log A + 2.76 \log \Delta - 2.48, \qquad (2)$$

defined for earthquakes in southern California, is a form of 
Eqn 1 with the instrument period (0.8 s) and nearly constant 
(shallow) depth incorporated in the constants, and the distance 
$\Delta$ in km. Richter magnitudes in their original form are no longer 
used because most earthquakes do not occur in California 
and Wood-Anderson seismographs are rare. However, local 
magnitudes are sometimes still reported because many buildings 
have resonant frequencies near 1 Hz, close to that of a 
Wood-Anderson seismograph, so $M_L$ is often a good indication 
of the structural damage an earthquake can cause. 
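
As a rough sketch of how Eqn 2 is applied, the following evaluates the local magnitude for a given amplitude and distance. The function name is hypothetical, and the code assumes that the amplitude is the Wood-Anderson trace amplitude in millimeters, a unit that Eqn 2 does not state explicitly.

```python
import math

def local_magnitude(amp_mm, dist_km):
    """Local magnitude from Eqn 2: M_L = log A + 2.76 log(distance) - 2.48.

    amp_mm  : peak trace amplitude, ASSUMED to be in mm on a
              Wood-Anderson record (not stated in Eqn 2)
    dist_km : source-receiver distance in km
    """
    return math.log10(amp_mm) + 2.76 * math.log10(dist_km) - 2.48

# A ten-fold larger amplitude at the same distance adds one magnitude unit.
print(round(local_magnitude(1.0, 100.0), 2))   # 3.04
print(round(local_magnitude(10.0, 100.0), 2))  # 4.04
```

The logarithmic form means that each additional unit of magnitude corresponds to a ten-fold increase in recorded amplitude at a fixed distance.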

With time, various local and global magnitude scales 
evolved. For global studies, the primary two were the body 

¹ We use the notations “log” for $\log_{10}$ and “ln” for the natural logarithm $\log_e$. 

² Magnitudes can be negative for very small displacements; a magnitude -1 
earthquake might correspond to a hammer blow. 
Fig. 4.6-1 The Richter scale for local magnitude, $M_L$. 
The magnitude is found from the amplitude of the largest 
arrival and the S - P travel time difference. In this example, 
the maximum amplitude is 23 mm and the S - P time is 
24 s, making $M_L$ = 5.0. (From Earthquakes by Bruce A. 
Bolt © 1978, 1988, 1993 by W. H. Freeman and Co. 
Used by permission.) 


wave magnitude, $m_b$, and surface wave magnitude, $M_s$. $m_b$ is 
measured from the early portion of the body wave train, 
usually the P wave, using 

$$m_b = \log(A/T) + Q(h, \Delta), \qquad (3)$$

where $A$ is the ground motion amplitude in microns after the 
effects of the seismometer are removed, $T$ is the wave period in 
seconds, and $Q$ is an empirical term depending on the distance 
and focal depth. This function can be derived either as a global 
average or for a specific region, as shown by Fig. 4.6-2. 
Measurements of $m_b$ depend on the seismometer used and the 
portion of the wave train measured. Common US practice uses the 
first 5 s of the record and periods less than 3 s, usually about 
1 s, on instruments with peak response near 1 s. $m_b$ is measured 
out to 100° distance, beyond which diffraction around the core 
has a complicated effect on the amplitude. 

The surface wave magnitude, $M_s$, is measured using the 
largest amplitude (zero to peak) of the surface waves 

$$M_s = \log(A/T) + 1.66 \log \Delta + 3.3 \quad \text{or} \quad M_s = \log A_{20} + 1.66 \log \Delta + 2.0, \qquad (4)$$

where the first form is general, and the second uses the 
amplitudes of Rayleigh waves with a period of 20 s, which often have 
the largest amplitudes. In these relations, $A$ is the ground 
motion amplitude in microns after the effects of the seismometer 
are removed, $T$ is the wave period in seconds, and the distance 
$\Delta$ is in degrees. 
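
The two forms of Eqn 4 can be checked against each other with a short sketch. The function names and the sample amplitude and distance are hypothetical; the units follow the text (amplitude in microns, period in seconds, distance in degrees).

```python
import math

def ms_general(amp_microns, period_s, dist_deg):
    """Surface wave magnitude, general form of Eqn 4."""
    return math.log10(amp_microns / period_s) + 1.66 * math.log10(dist_deg) + 3.3

def ms_20s(amp20_microns, dist_deg):
    """Surface wave magnitude for 20 s Rayleigh waves, second form of Eqn 4."""
    return math.log10(amp20_microns) + 1.66 * math.log10(dist_deg) + 2.0

# Hypothetical measurement: 100 microns at 20 s period, 60 degrees away.
a, delta = 100.0, 60.0
print(round(ms_general(a, 20.0, delta), 2), round(ms_20s(a, delta), 2))
```

The two forms agree to within rounding of the constants because log(1/20) is about -1.3, which converts the 3.3 in the general form to the 2.0 in the 20 s form.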

As measures of earthquake size, magnitudes have two major 
advantages. First, they are directly measured from seismograms 
without sophisticated signal processing. Second, they 
yield units of order 1 which are intuitively attractive: magnitude 
5 earthquakes are moderate, magnitude 6 are strong, 7 are 
major, and 8 are great. 

However, magnitudes have two related limitations. First, 
they are totally empirical and thus have no direct connection 
to the physics of earthquakes. A striking illustration of this is 
that Eqns 1-4 are not even dimensionally correct: logarithms 
can be taken only for dimensionless quantities, whereas these 
expressions involve ratios of displacement to period. A second 
difficulty is with the numbers that emerge. Magnitude 
estimates vary noticeably with azimuth, due to the amplitude 
radiation patterns (Section 4.3), although this difficulty can be 
reduced by averaging results. The different magnitude scales 
Fig. 4.6-2 $Q$ factor for body wave 
magnitude $m_b$ derived for P waves 
from earthquakes in the Tonga region 
recorded by a temporary deployment of 
seismometers. $Q$ depends on focal depth 
and epicentral distance. (Wysession et al., 
1996. © Seismological Society of America. 
All rights reserved.) 
Table 4.6-1 Source parameters for selected earthquakes. 

Earthquake            m_b    M_s    Fault area (km^2)    Average            Moment M_0     M_w 
                                    (length x width)     dislocation (m)    (dyn-cm) 
Truckee, 1966         5.4    5.9    10 x 10              0.3                8.3 x 10^24    5.9 
San Fernando, 1971    6.2    6.6    20 x 14              1.4                1.2 x 10^26    6.7 
Loma Prieta, 1989     6.2    7.1    40 x 15              1.7                3.0 x 10^26    6.9 
San Francisco, 1906   -      7.8    450 x 10             4                  5.4 x 10^27    7.8 
Alaska, 1964          6.2    8.4    500 x 300            7                  5.2 x 10^29    9.1 
Chile, 1960           -      8.3    800 x 200            21                 2.4 x 10^30    9.5 

m_b: body wave magnitude; M_s: surface wave magnitude; M_w: moment magnitude; -: no value listed. 
Sources: Values from Geller (1976), Wallace et al. (1991), and Wald et al. (1993). 



yield different values. Moreover, body and surface wave magnitudes 
do not correctly reflect the size of large earthquakes. 

The latter two effects are illustrated in Table 4.6-1, which 
gives magnitudes for various earthquakes, ordered by increasing 
scalar moment.³ As shown, $m_b$ and $M_s$ differ significantly. 
The earthquakes with moments greater than that of the San 
Fernando earthquake all have $m_b$ = 6.2, even as the moment 
increases by a factor of 20,000. Similarly, the earthquakes 
larger than the San Francisco earthquake have $M_s$ about 8.3, 
even as the moment increases by a factor of 400. This effect, 
called magnitude saturation, is a general phenomenon for $m_b$ 
above about 6.2 and $M_s$ above about 8.3. 

Earthquake source parameter data like those in Table 4.6-1, 
some of which are shown in Fig. 4.6-3, are used to investigate 
issues related to earthquake size. Before doing so, it is worth 
briefly discussing how the tectonic setting affects earthquake 

³ Seismic moments are reported either in dyn-cm or N-m, with 1 N-m = $10^7$ dyn-cm. 


size. All of these earthquakes, except for Chile, reflect deformation 
in the broad boundary zone between the North American 
and Pacific plates (Fig. 5.2-3). The San Fernando earthquake 
occurred on a buried thrust fault in the Los Angeles area, similar 
to the Northridge earthquake (Figs 4.5-9 and 4.5-10). These 
relatively short faults are part of an oblique trend in the boundary 
zone, so the fault areas tend to be roughly rectangular. 
Their down-dip width seems controlled by the fact that rocks 
deeper than about 20 km are weak and undergo stable sliding 
rather than accumulating elastic strain for future earthquakes, 
as discussed in the context of fault locking (Section 4.5.4). 
The next largest earthquake, Loma Prieta, occurred either close 
to or on a short segment of the San Andreas fault (Fig. 1.2-16), 
and hence on a somewhat longer fault of comparable width. 
The San Francisco earthquake ruptured a long segment of 
the San Andreas fault with significantly larger slip, but because 
the fault is vertical, still had a narrow width. Thus the 1906 
earthquake illustrates approximately the maximum size of 
Fig. 4.6-3 Comparison of moment, magnitudes, fault area, and fault slip 
for four earthquakes listed in Table 4.6-1 (San Fernando, 1971; 
San Francisco, 1906; Alaska, 1964; Chile, 1960). $M_s$ saturates for events with 
$M_w$ > 8 and so is no longer a useful measure of earthquake size. 


continental transform earthquakes. However, the Alaska and 
Chilean earthquakes had much larger rupture areas because 
they occurred on shallow-dipping subduction thrust interfaces. 
As shown in Fig. 4.5-16, these faults can have widths of 
hundreds of km on which elastic strain can build up and eventually 
be released seismically. As will be discussed shortly, the 
larger fault dimensions give rise to greater slip, so the combined 
effects of larger fault area and more slip cause the largest 
earthquakes to occur at subduction zones rather than on transforms. 

It is important to realize that values like those in Table 4.6-1 
are estimates with considerable uncertainties due to various 
causes. First, there are uncertainties due to the earth’s variability 
and deviations from the mathematical simplifications 
used. For example, even with high-quality modern data, seismic 
moment estimates for the Loma Prieta earthquake vary 
by about 25%, and $M_s$ values vary by about 0.2 units. Second, 
the estimation techniques vary. The actual approaches used 
to compute magnitudes have changed with time in various ways 
(note that the pre-1964 earthquakes do not have $m_b$ values). 
Uncertainties for historic earthquakes are especially large; for 
example, fault length estimates for the 1906 San Francisco 
earthquake vary from 300 to 500 km, $M_s$ has been estimated at 8.3 
but is now thought to be about 7.8, and the fault width is 
essentially unknown and inferred from the depths of more recent 
earthquakes and geodetic data. Third, different techniques (body 
waves, surface waves, geodesy, geology) can yield different 
estimates. Fourth, the fault dimensions and dislocations shown 
are average values for quantities that can vary significantly 
along the fault (Fig. 4.5-10). As a result, different studies 
yield varying and sometimes inconsistent values, depending 
on which parameters are estimated directly from data, which 
are assumed, and which are inferred by combining others. For 
example, the relation between the seismic moment, slip, and 
fault dimensions depends on the rigidity assumed (typically 
3-5 x $10^{11}$ dyn/cm$^2$ for shallow earthquakes). Even so, such data 
are sufficient to show the basic effects of interest. 

We can understand these effects given what we have discussed 
about the amplitudes of body and surface waves in 
Sections 4.2 and 4.3, information that was not available to 
seismologists when these magnitude scales were developed. We 
have seen that the amplitudes depend on the scalar moment, 
the azimuth of a seismometer relative to the fault geometry, the 
distance from the source, and the source depth. Moreover, 
because the source time function has a finite duration, depending 
on fault dimensions and rise time, the amplitudes vary 
with frequency. We will see shortly that these frequency 
variations explain the differences between magnitudes and their 
saturation. 

Before doing so, we note the simple and elegant solution that 
has been adopted: namely, defining a magnitude scale based on 
the seismic moment. The moment magnitude, 

$$M_w = \frac{\log M_0}{1.5} - 10.73, \qquad (5)$$

defined for $M_0$ in dyn-cm, has several advantages. It gives a 
magnitude directly tied to earthquake source processes that 
does not saturate. Moreover, it preserves the simplicity of the 
magnitude scale by giving values of order 1 compatible with 
other magnitude scales. As we will see, $M_w$ is comparable to $M_s$ 
until $M_s$ saturates at about 8.2, but then increases. The largest 
seismically recorded earthquake, the 1960 Chile event listed 
in Table 4.6-1, had $M_w$ 9.5. Moment magnitude has become 
the common measure of the magnitude of large earthquakes. 
Estimation of $M_0$ (and therefore $M_w$) requires more analysis of 
seismograms than for $m_b$ or $M_s$. However, semi-automated 
programs like the Harvard CMT project or comparable regional 
analyses now regularly compute moment magnitudes 
for most earthquakes larger than about $M_w$ 5. 
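
The moment magnitude relation is easy to evaluate directly. As a minimal sketch, the following applies it to the moments listed in Table 4.6-1 and compares the result with the tabulated $M_s$ values; the function and variable names are hypothetical.

```python
import math

def moment_magnitude(m0_dyn_cm):
    """Moment magnitude M_w = log10(M_0)/1.5 - 10.73, with M_0 in dyn-cm."""
    return math.log10(m0_dyn_cm) / 1.5 - 10.73

# (moment in dyn-cm, surface wave magnitude) from Table 4.6-1
events = {
    "Truckee, 1966":       (8.3e24, 5.9),
    "San Fernando, 1971":  (1.2e26, 6.6),
    "Loma Prieta, 1989":   (3.0e26, 7.1),
    "San Francisco, 1906": (5.4e27, 7.8),
    "Alaska, 1964":        (5.2e29, 8.4),
    "Chile, 1960":         (2.4e30, 8.3),
}

for name, (m0, ms) in events.items():
    print(f"{name:20s}  M_s = {ms:.1f}  M_w = {moment_magnitude(m0):.1f}")
```

For the smaller events the two magnitudes are similar, whereas for Alaska and Chile the moment magnitude continues to grow while $M_s$ stays near 8.3 to 8.4.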


4.6.2 Source spectra and scaling laws 


The relations between the moment and various magnitudes 
arise from the spectrum of the radiated seismic waves. We saw 
in Section 4.3.2 that the radiated waves depend on the product 
of the scalar moment and the source time function generated by 
the earthquake. We used a simple model in which the time 
function was the convolution of two “boxcar” time functions 
due to the finite length of the fault and the finite rise time of the 
faulting at any point. The Fourier transform of the resulting 
time function is the product of the transforms of the boxcars. 

The transform of a boxcar of height $1/T$ and length $T$ is 

$$F(\omega) = \int_{-T/2}^{T/2} \frac{1}{T} e^{i\omega t}\, dt = \frac{1}{T i \omega}\left(e^{i\omega T/2} - e^{-i\omega T/2}\right) = \frac{\sin(\omega T/2)}{\omega T/2}. \qquad (6)$$

This function, sometimes written as sinc x = (sin x)/x, appears 
in many applications in which only part of a signal is selected. 
In Section 2.5.10 it described the amplitude resulting when part 
of a plane wave diffracts through a slit. In Section 6.3, we will 
use it to describe the effect on a time series spectrum from using 
only part of the series. Here, the sinc function describes the fact 
that the source pulse has finite duration. 
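
The boxcar transform can be checked numerically. The sketch below integrates the boxcar of height $1/T$ with a simple midpoint rule and compares the result with (sin x)/x for x = ωT/2; the function names, the 2 s duration, and the sample frequencies are hypothetical choices.

```python
import cmath
import math

def boxcar_transform_numeric(omega, T, n=2000):
    """Midpoint-rule estimate of the integral of (1/T) exp(i*omega*t)
    over -T/2 <= t <= T/2."""
    dt = T / n
    total = 0j
    for k in range(n):
        t = -T / 2 + (k + 0.5) * dt
        total += cmath.exp(1j * omega * t) * dt / T
    return total

def sinc(x):
    """(sin x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

T = 2.0  # hypothetical boxcar duration in seconds
for omega in (0.1, 1.0, 5.0):
    numeric = boxcar_transform_numeric(omega, T)
    print(omega, round(numeric.real, 5), round(sinc(omega * T / 2), 5))
```

The imaginary part of the numerical integral vanishes because the boxcar is symmetric about t = 0, so only the real part needs to be compared.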

Thus the spectral amplitude of the source signal is the product 
of the seismic moment and two sinc terms, 

$$|A(\omega)| = M_0 \left|\frac{\sin(\omega T_R/2)}{\omega T_R/2}\right| \left|\frac{\sin(\omega T_D/2)}{\omega T_D/2}\right|, \qquad (7)$$

where $T_R$ and $T_D$ are the rupture and rise times. Often, we use 
the logarithm of Eqn 7, 

$$\log |A(\omega)| = \log M_0 + \log |\mathrm{sinc}(\omega T_R/2)| + \log |\mathrm{sinc}(\omega T_D/2)|. \qquad (8)$$

A useful approximation is to treat sinc x as 1 for x < 1, and 1/x 
for x > 1, as shown in Fig. 4.6-4 (top). In this approximation a 
plot of $\log |A(\omega)|$ versus $\log \omega$ is just three segments, 
corresponding to different frequency ranges (Fig. 4.6-4, bottom). 
Assuming $T_R > T_D$, we have 

Fig. 4.6-4 Top: The approximation to the (sin x)/x function used in 
modeling the source spectrum. Bottom: Theoretical source spectrum of 
an earthquake, modeled as three regions with slopes of 1, $\omega^{-1}$, and $\omega^{-2}$, 
divided by angular frequencies corresponding to the rupture and rise 
times, $T_R$ and $T_D$. Another common approximation uses a single corner 
frequency, $f_c$, at the intersection of the first and third spectrum segments. 
The flat segment extending to zero frequency gives $M_0$. 


$$\log |A(\omega)| = \begin{cases} \log M_0 & \omega < 2/T_R \\ \log M_0 - \log(T_R/2) - \log \omega & 2/T_R < \omega < 2/T_D \\ \log M_0 - \log(T_R T_D/4) - 2 \log \omega & 2/T_D < \omega. \end{cases} \qquad (9)$$


This plot is divided into three regions by the frequencies $2/T_R$ 
and $2/T_D$, which are called corner frequencies. The spectrum 
is flat for frequencies less than the first corner, goes as $\omega^{-1}$ 
between the corners, and decays as $\omega^{-2}$ for the high frequencies. 
Thus the spectrum is parametrized by three factors: seismic 
moment, rise time, and rupture time. It is worth noting that 
other source spectral models have been used. A third corner 
frequency can be added to this model, representing the effects 
of fault width and yielding an $\omega^{-3}$ segment at high frequency. 
Other models have a single corner frequency (dashed line in 
Fig. 4.6-4, bottom) that combines the effects of rise and rupture 
time. As a result, the interpretation of observed earthquake 
spectra depends somewhat on the source model. 
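
The three-segment approximation to the source spectrum can be written as a short function. This is a sketch of the two-corner model only, assuming the rupture time exceeds the rise time; the function name and the example moment and times are hypothetical.

```python
import math

def log_spectrum(omega, m0, t_r, t_d):
    """Three-segment approximation to log10 |A(omega)|.

    omega : angular frequency (rad/s)
    m0    : seismic moment (sets the flat low-frequency level)
    t_r   : rupture time (s), assumed >= t_d
    t_d   : rise time (s)
    """
    if t_r < t_d:
        raise ValueError("this sketch assumes t_r >= t_d")
    if omega < 2.0 / t_r:                       # flat segment
        return math.log10(m0)
    if omega < 2.0 / t_d:                       # omega^-1 segment
        return math.log10(m0) - math.log10(t_r / 2.0) - math.log10(omega)
    # omega^-2 segment
    return math.log10(m0) - math.log10(t_r * t_d / 4.0) - 2.0 * math.log10(omega)

# Hypothetical source: M_0 = 1e27 dyn-cm, T_R = 35 s, T_D = 10 s.
for omega in (0.01, 0.1, 1.0, 10.0):
    print(omega, round(log_spectrum(omega, 1e27, 35.0, 10.0), 2))
```

The segments join continuously at the two corner frequencies, so the only free parameters are the moment and the two times.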

To see how the source spectrum varies with earthquake size, 
we first note that the seismic moment is the scale factor for the 
spectral amplitude at low frequencies $\omega \to 0$. This is the reason 
why it is also called the “static” moment. It is defined (Section 
4.2.3) as the product of the rigidity at the source depth, $\mu$, the 
average slip (or dislocation) on the fault, $D$, and the fault area, 
$S$. The fault area can be written in terms of a shape factor $f$ and 
the square of a dimension $L$, so 

$$M_0 = \mu D S = \mu D f L^2. \qquad (10)$$

For large earthquakes, faults are often treated as approximately 
rectangular, so $L$ is the length, and $f$ is the ratio of width 
to length. Another common approach uses a circular fault 
model for which $L$ is the radius and $f = \pi$. 

The rupture time (Eqn 4.3.8) needed for the rupture to 
propagate along the fault is approximately 

$$T_R = L/v_R = L/(0.7\beta), \qquad (11)$$

if we assume that the rupture velocity is about 0.7 times the 
shear velocity $\beta$. The rise time needed for the dislocation to reach 
its full value at any point on the fault has been predicted to be 
about 

$$T_D = \mu D/(\beta \Delta\sigma) = 16 L f^{1/2}/(7\pi^{1.5}\beta), \qquad (12)$$

where $\Delta\sigma$ is the stress drop in the earthquake, a quantity that 
we will discuss shortly. Assuming a shear velocity of about 
4 km/s, Eqns 11 and 12 yield approximately 

$$T_R = 0.35L, \qquad T_D = 0.1 L f^{1/2}, \qquad (13)$$

with $L$ in km and the times in seconds. 

Table 4.6-1 shows that the Truckee and San Fernando earthquakes 
occurred on approximately square faults ($f = 1$), Loma 
Prieta and Alaska had $L \approx 2W$, or $f \approx 0.5$, and the San Francisco 
Fig. 4.6-5 Theoretical source spectra of surface and body waves. The two 
are identical at frequencies below the $\omega^{-2}$ corner frequency. This model 
includes a fault-width corner frequency, and thus an $\omega^{-3}$ segment at high 
frequency. $m_b$, reflecting the amplitude at 1 s, saturates at about 6 for 
earthquakes with moment above about $10^{25}$ dyn-cm. $M_s$, measured at 
20 s, saturates at about 8 for moments greater than about 5 x $10^{27}$ dyn-cm. 
The x axis is in frequency (Hz) rather than angular frequency ($\omega$). 
(Geller, 1976. © Seismological Society of America. All rights reserved.) 

earthquake occurred on a long narrow fault with $L \gg W$, or 
$f < 0.1$. Thus, in these cases or for a circular fault, $T_R > T_D$, as 
drawn in Fig. 4.6-4. 
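
The rupture and rise time relations (Eqns 11 and 12) can be evaluated for the fault dimensions in Table 4.6-1. The following is a minimal sketch assuming a shear velocity of 4 km/s, a rupture velocity of 0.7 times that value, and $L$ taken as the fault length with $f$ the ratio of width to length; the function names are hypothetical.

```python
import math

def rupture_time(L_km, beta_km_s=4.0):
    """Rupture time T_R = L / (0.7 * beta), Eqn 11 (seconds)."""
    return L_km / (0.7 * beta_km_s)

def rise_time(L_km, f, beta_km_s=4.0):
    """Rise time T_D = 16 L f^(1/2) / (7 pi^1.5 beta), Eqn 12 (seconds)."""
    return 16.0 * L_km * math.sqrt(f) / (7.0 * math.pi ** 1.5 * beta_km_s)

# (length, width) in km from Table 4.6-1
faults = {
    "Truckee, 1966":       (10, 10),
    "San Fernando, 1971":  (20, 14),
    "Loma Prieta, 1989":   (40, 15),
    "San Francisco, 1906": (450, 10),
    "Alaska, 1964":        (500, 300),
    "Chile, 1960":         (800, 200),
}

for name, (L, W) in faults.items():
    t_r, t_d = rupture_time(L), rise_time(L, W / L)
    # Corner frequencies of the two-corner model are 2/T_R and 2/T_D.
    print(f"{name:20s} T_R = {t_r:6.1f} s  T_D = {t_d:5.1f} s  "
          f"corners = {2 / t_r:.3f}, {2 / t_d:.3f} rad/s")
```

For each of these faults the rupture time exceeds the rise time, so the first corner frequency is set by the rupture time.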

As we will see, earthquake stress drops are approximately 
independent of seismic moment, implying that the slip is 
proportional to fault length. Hence, for an assumed stress drop, we 
can compute theoretical spectra for various moments and fault 
lengths (Fig. 4.6-5). The results show why $m_b$ and $M_s$ differ, 
and why both magnitude scales saturate. As the fault length 
increases, the seismic moment, rupture time, and rise time 
increase. Thus the corner frequencies move to the left, to lower 
frequencies. The moment, $M_0$, determines the zero-frequency 
level, which rises as the earthquake becomes larger. However, 
the surface wave magnitude, $M_s$, is measured at a period of 
20 s, and so depends on the spectral amplitude at this period. 
For earthquakes with moments less than about $10^{26}$ dyn-cm, 
a 20 s period corresponds to the flat part of the spectrum, so 
$M_s$ increases with moment. However, for larger moments, 20 s 
is to the right of the first corner frequency, so $M_s$ does not 
increase at the same rate as the moment. Once the moment 
exceeds about 5 x $10^{27}$ dyn-cm, 20 s is to the right of the second 
corner, on the $\omega^{-2}$ portion of the spectrum. Thus $M_s$ saturates 
at about 8.2, even if the moment increases. A similar effect 
occurs for body wave magnitude, which depends on the 
amplitude at a period of 1 s. Because this period is shorter than 
the 20 s used for $M_s$, $m_b$ saturates at a lower moment (about 
$10^{25}$ dyn-cm), and remains about 6 even for much larger 
earthquakes. Similar saturation effects occur for other magnitude 
scales which are measured at specific frequencies. 


Fig. 4.6-6 Plots of $M_s$ versus $\log M_0$ and $M_s$ versus log of fault area ($S$) 
show the saturation of surface wave magnitude. $M_s$ saturates even as the 
moment and fault areas increase. Lines show the predictions of the scaling 
relations in Table 4.6-2. Open and closed circles denote intraplate and 
interplate earthquakes, respectively. (Geller, 1976. © Seismological 
Society of America. All rights reserved.) 

Another way to view magnitude saturation is shown by the 
data for various earthquakes in Fig. 4.6-6. For earthquakes 
above about $10^{28}$ dyn-cm, $M_s$ saturates even for progressively 
larger fault areas and thus seismic moments. As a result, $M_s$ 
is not a useful measure of the size of very large earthquakes. 
For this reason, moments or moment magnitudes are used to 
describe large earthquakes. 

These effects are described by theoretical scaling relations 
between various source parameters. Figure 4.6-6 shows that 
the scaling relations used to generate Fig. 4.6-5 describe the 
data relatively well, given the uncertainties in the data and 
the simplifying assumptions required to derive these relations. 
Table 4.6-2 presents these scaling relations and one relating 
m b and M s . Although the specific numerical values in these 
relations are approximations, the general trends in the data are 
relatively well described, so scaling relations provide powerful 
tools. They provide valuable insight into the relation between 
source parameters and are used to estimate source parameters 
for earthquakes that have not yet occurred, or for which 
parameters of interest are unknown. 
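
As an illustration of how such scaling relations are used, the sketch below inverts the moment and $M_s$ relations of Table 4.6-2 (Geller, 1976; 50 bar stress drop) to predict $M_s$ from $\log M_0$. The function name is hypothetical, and the third branch, read here as $\log M_0 = 3M_s + 3.33$, is an assumption chosen because it joins the neighboring branches continuously.

```python
def ms_from_log_m0(log_m0):
    """Predicted surface wave magnitude for a given log10(M_0 in dyn-cm).

    Piecewise inversion of the Table 4.6-2 relations:
      log M_0 = M_s + 18.89        for M_s < 6.76
      log M_0 = 1.5 M_s + 15.51    for 6.76 < M_s < 8.12
      log M_0 = 3 M_s + 3.33       for 8.12 < M_s < 8.22 (assumed reading)
      M_s = 8.22                   for larger moments (saturation)
    """
    if log_m0 < 6.76 + 18.89:
        return log_m0 - 18.89
    if log_m0 < 1.5 * 8.12 + 15.51:
        return (log_m0 - 15.51) / 1.5
    if log_m0 < 3.0 * 8.22 + 3.33:
        return (log_m0 - 3.33) / 3.0
    return 8.22

for log_m0 in (24.0, 26.0, 27.0, 28.0, 29.0, 30.0):
    print(log_m0, round(ms_from_log_m0(log_m0), 2))
```

Above a moment of about $10^{28}$ dyn-cm the predicted $M_s$ no longer changes, which is the saturation seen in the data.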

Another approach to some of these issues uses empirical 
regression relations between source parameters compiled for 
many earthquakes, as illustrated in Fig. 4.6-7. Although these 
relations do not allow us to explore theoretical relationships 
between parameters, such as magnitude saturation, they offer 
useful inferences about past and potential earthquakes. For 
example, these regressions imply that an earthquake on a 
100 km-long fault would have an average slip of about 2 m and 
M w about 7.4, whereas on a 10 km-long fault we expect about 
0.3 m slip and M w about 6.2. As for the scaling laws, these 
estimates should be taken as useful averages. For example, we 
would be surprised by 10 m of motion on a 100 km-long fault, 
but not by 1 or 4 m. 

Fig. 4.6-7 Empirical relations showing the 
average slip, fault length, and moment 
magnitude for a compilation of 
earthquakes, plotted against surface rupture 
length (km). (Wells and Coppersmith, 
1994. © Seismological Society of America. 
All rights reserved.) 

Table 4.6-2 Earthquake scaling relations. 

m_b and M_s are related by 

  m_b = M_s + 1.33              M_s < 2.86 
  m_b = 0.67 M_s + 2.28         2.86 < M_s < 4.90 
  m_b = 0.33 M_s + 3.91         4.90 < M_s < 6.27 
  m_b = 6.00                    6.27 < M_s. 

Assuming L = 2W, M_s and fault area (in km^2) are related by 

  log S = 0.67 M_s - 2.28       M_s < 6.76 
  log S = M_s - 4.53            6.76 < M_s < 8.12 
  log S = 2 M_s - 12.65         8.12 < M_s < 8.22 
  M_s = 8.22                    S > 6080 km^2. 

Assuming a stress drop of 50 bars, log M_0 (in dyn-cm) and M_s are 
related by 

  log M_0 = M_s + 18.89         M_s < 6.76 
  log M_0 = 1.5 M_s + 15.51     6.76 < M_s < 8.12 
  log M_0 = 3 M_s + 3.33        8.12 < M_s < 8.22 
  M_s = 8.22                    log M_0 > 28. 

Source: Geller (1976). 


4.6.3 Stress drop and earthquake energy 

The relationship between the slip in an earthquake, its fault 
dimensions, and its seismic moment is closely tied to the magnitude 
of the stress released by the earthquake, or stress drop. 
As discussed in Section 4.5.4, the earthquake releases the strain 
that has accumulated over time near the fault, so the radiated 
seismic waves can be used to estimate the stress change. 

To do this, we assume that the earthquake’s slip, $D$, occurs 
on a fault with characteristic dimension $L$, and so causes a 
strain change of approximately 

$$\frac{\partial u_x}{\partial x} \approx \frac{D}{L}, \qquad (14)$$

so the stress drop averaged over the fault is approximately 

$$\Delta\sigma \approx \mu D/L. \qquad (15)$$

From seismological observations alone, the best-constrained 
quantity is the seismic moment, so we estimate the average slip, 
$D$, from the seismic moment as 

$$D \approx c M_0/(\mu L^2), \qquad (16)$$

where $c$ is a factor depending on the fault’s shape. Thus the 
stress drop is proportional to the moment and inversely 
proportional to the fault dimension cubed or the 3/2 power of the 
fault area: 

$$\Delta\sigma \approx c M_0/L^3 = c M_0/S^{3/2}. \qquad (17)$$

The specific relation and values of $c$ depend on the fault shape 
and the rupture direction. For example, the stress drop on a 
circular fault with a radius $R$ is 

$$\Delta\sigma = \frac{7}{16} \frac{M_0}{R^3}, \qquad (18)$$

strike-slip on a rectangular fault with length $L$ and width $w$ 
yields 

$$\Delta\sigma = \frac{2}{\pi} \frac{M_0}{w^2 L}, \qquad (19)$$

and dip-slip on a rectangular fault gives 

$$\Delta\sigma = \frac{4(\lambda + \mu)}{\pi(\lambda + 2\mu)} \frac{M_0}{w^2 L} = \frac{8}{3\pi} \frac{M_0}{w^2 L}, \qquad (20)$$

where the last form assumes $\lambda = \mu$. 

Fig. 4.6-8 Amplitude spectrum averaged from P waves recorded at 
globally distributed broadband seismometers for the October 21, 1995, 
earthquake near Chiapas, Mexico. (Rebollar et al., 1999. © Seismological 
Society of America. All rights reserved.) 

These equations let us estimate the stress drop from an 
observed seismic moment and inferred fault dimensions. If 
we know the fault dimensions from other observations, this 
process is straightforward. For example, the fault area of 
the great 1964 Alaska earthquake can be estimated from the 
aftershock area, the source finiteness shown by surface waves 
(Fig. 4.3-15), and geodetic data. Thus using the values in Table 
4.6-1 and Eqn 20 with $\lambda = \mu$ yields an average stress drop 
estimate 

$$\Delta\sigma = \frac{8}{3\pi}\,\frac{5.2 \times 10^{29}\ \text{dyn-cm}}{(9 \times 10^{14}\ \text{cm}^2)(5 \times 10^{7}\ \text{cm})} \approx 10^{7}\ \text{dyn/cm}^2 = 10\ \text{bars}. \tag{21}$$
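
The arithmetic in Eqn 21 can be checked numerically. The sketch below evaluates the dip-slip relation, stress drop = [4(λ + μ)/(π(λ + 2μ))] M_0/(w²L), for the Alaska values quoted above. It is an illustration only: the function name and the `lam_over_mu` parameter are our own labels, and the fault width is simply the square root of the 9 x 10^14 cm² factor in the denominator.

```python
import math

BAR_IN_DYN_PER_CM2 = 1.0e6  # 1 bar = 10^6 dyn/cm^2


def dip_slip_stress_drop(moment, length, width, lam_over_mu=1.0):
    """Stress drop (dyn/cm^2) for dip-slip on a rectangular fault.

    moment is in dyn-cm; length and width are in cm.
    lam_over_mu is the ratio lambda/mu; 1.0 gives the 8/(3*pi) form.
    """
    shape = 4.0 * (lam_over_mu + 1.0) / (math.pi * (lam_over_mu + 2.0))
    return shape * moment / (width ** 2 * length)


# Values quoted in the text for the 1964 Alaska earthquake.
alaska_moment = 5.2e29             # dyn-cm
alaska_length = 5.0e7              # cm (500 km)
alaska_width = math.sqrt(9.0e14)   # cm, because w^2 = 9 x 10^14 cm^2

alaska_bars = dip_slip_stress_drop(
    alaska_moment, alaska_length, alaska_width) / BAR_IN_DYN_PER_CM2
print(f"Alaska 1964 stress drop: about {alaska_bars:.0f} bars")
```

Changing `lam_over_mu` shows how weakly the shape factor depends on the elastic constants compared with its strong dependence on the fault dimensions.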

However, without independent knowledge of fault dimensions, 
estimating the stress drop is harder. One approach uses 
the spectrum to identify corner frequencies and estimate the 
rupture time and hence fault dimensions. Figure 4.6-8 illustrates 
this for an $M_w$ 7.1 earthquake occurring at 165 km depth in the 
subduction zone beneath Mexico. Analysis of the spectrum 
with a single corner frequency model like that in Fig. 4.6-4, 
and assuming a circular fault with rupture velocity of 3 km/s, 
yielded a rupture duration of 22 s and a stress drop of about 
65 bars.$^4$ The low-frequency portion of the spectrum yields a 

$^4$ Stress drops are reported either in bars (1 bar = $10^6$ dyn/cm$^2$) or megapascals 
(1 MPa = 10 bars). 



Fig. 4.6-9 Theoretical spectra for a shallow (about 8 km focal depth) 
earthquake at several stations. Due to free surface effects, the spectra differ 
from theoretical source spectra like that in Fig. 4.6-4. (Langston, 1978. 
J. Geophys. Res., 83, 3422-6, copyright by the American Geophysical 
Union.) 


moment of $5.2 \times 10^{26}$ dyn-cm, in reasonable agreement with 
other studies, which found 4.6 and $7.1 \times 10^{26}$ dyn-cm. 

In many cases the spectrum is not directly amenable to corner 
frequency analyses. The earthquake in Fig. 4.6-8 was deep 
enough that the spectrum of the direct P wave could be found 
without contamination from later-arriving surface reflections. 
However, for shallow earthquakes, P, pP, and sP often overlap 
(Fig. 4.3-7), yielding a combined spectrum quite different from 
the source pulse. Figure 4.6-9 illustrates this effect for a shallow 
earthquake. As shown, the spectra differ significantly between 
stations, due to the variation in amplitude between direct and 
reflected arrivals, and cannot be used to find corner frequencies 
or the seismic moment. This difficulty can be addressed by 
modeling the body waves, including the free surface reflections, 
and estimating the source time function duration by matching 
the observed waveforms. Given a duration estimate and an 
assumed fault geometry, the fault length and stress drop are 
estimated in the same way as in the corner frequency analysis. 

These examples illustrate that estimating the fault dimen¬ 
sions and stress drop is challenging, whether it is done in the 
time domain, by modeling or inverting waveforms, or in the 
frequency domain. First, the parameter required is estimated 
only with modest precision, as shown by the issue of choosing 
the corner frequency even with high-quality data like that in 
Fig. 4.6-8. The uncertainty is compounded by the fact that 
inferring a source dimension from the corner frequency or source 
time function requires assuming the rupture velocity and 
fault geometry. Moreover, the estimated stress drop depends 
on $1/L^3$, so uncertainty in the fault dimension causes a large 
uncertainty in $\Delta\sigma$. Figure 4.6-10 illustrates this issue via 
synthetic P waves for different source time function durations. As 
shown, the seismogram depends only moderately on the source 
time function. However, small differences in time function 
duration correspond to larger differences in stress drop, even 
for an assumed rupture velocity and fault geometry. 

Fig. 4.6-10 Synthetic seismograms for different source time functions and 
corresponding inferred stress drops. A small uncertainty in source time 
function duration results in a large uncertainty in stress drop. (Stein and 
Kroeger, 1980. Reproduced with the permission of the American Society 
of Mechanical Engineers.) 


This brings up the interesting question of how the 
uncertainty in a quantity like stress drop, which is inferred by 
combining parameters estimated from data using model 
assumptions, is related to the uncertainties involved in each step. 
A common approach, derived in Section 6.5.1, uses the 
propagation of errors relation (Eqn 6.5.19), which involves the 
partial derivatives of the parameters going into the final 
quantity. To use this, we write the stress drop (Eqn 17) with the fault 
dimension equal to the product of rupture velocity and rupture 
time, 

$$\Delta\sigma = f(c, M_0, v_R, T_R) = cM_0/(v_R T_R)^3. \tag{22}$$

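The standard deviation, or uncertainty, in the stress drop is 
thus approximately 

$$\sigma_{\Delta\sigma}^2 \approx \left(\frac{\partial f}{\partial c}\right)^2 \sigma_c^2 + \left(\frac{\partial f}{\partial M_0}\right)^2 \sigma_{M_0}^2 + \left(\frac{\partial f}{\partial v_R}\right)^2 \sigma_{v_R}^2 + \left(\frac{\partial f}{\partial T_R}\right)^2 \sigma_{T_R}^2. \tag{23}$$

We can compute the partial derivatives and use estimates of 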
the parameters and their uncertainties to estimate the resulting 
uncertainty in the stress drop. For example, we might assume 
the seismic moment, rupture time, and rupture velocity are un¬ 
certain to about 25%. As Eqns 18-20 show, different models 
give different shape factors, and various methods are used to 
interpret corner frequencies, so c is uncertain to at least 50% if 
we have no other knowledge of the fault geometry. Depending 
on the values used, it seems that the precision of a stress drop 
estimate is often a factor of 2 or 3. The accuracy — how this 
value is related to the physical process of faulting — is hard to 
assess, because the form of the time function and its relation to 
other source parameters are derived for simple source models, 
which may or may not describe real faulting very well. Hence 
the stress drop is more usefully viewed as a characterization 
of the source spectrum than as giving direct insight into the 
physics of the source. 
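
To make the "factor of 2 or 3" concrete, the propagation of errors can be evaluated for the stress drop written as cM_0/(v_R T_R)^3. Dividing the variance by the square of the stress drop gives a sum of squared fractional uncertainties, with those of v_R and T_R multiplied by 3 because of the cube. The sketch below is a rough illustration: it assumes the four errors are independent, uses the 25% and 50% figures suggested above, and relies on a first-order formula that is only approximate for uncertainties this large. The function name is ours.

```python
import math


def stress_drop_relative_uncertainty(rel_c, rel_m0, rel_v, rel_t):
    """First-order fractional uncertainty in c*M0/(v_R*T_R)**3.

    Each argument is a fractional (relative) standard deviation.
    The cube on v_R and T_R multiplies their contributions by 3.
    """
    return math.sqrt(rel_c ** 2 + rel_m0 ** 2
                     + (3.0 * rel_v) ** 2 + (3.0 * rel_t) ** 2)


rel_sigma = stress_drop_relative_uncertainty(
    rel_c=0.50, rel_m0=0.25, rel_v=0.25, rel_t=0.25)
print(f"fractional uncertainty in stress drop: {rel_sigma:.2f}")
```

With these inputs the fractional uncertainty exceeds 1, and most of it comes from the rupture velocity and rupture time rather than from the moment.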



Fig. 4.6-11 Stress drops for interplate (plate boundary) and intraplate 
(plate interior) earthquakes. The earthquakes are plotted in terms of log 
fault area and log seismic moment. The lines for constant stress drop 
values have slopes of 2/3, as shown by Eqn 17. Most earthquakes have 
$\Delta\sigma$ = 10-100 bars, with intraplate and interplate events trending toward 
the higher and lower ends of the range, respectively. (Kanamori and 
Anderson, 1975. © Seismological Society of America. All rights reserved.) 


With all these difficulties, it is encouraging that earthquake 
stress drop studies typically yield values in the 10-100s of bars, 
as shown in Fig. 4.6-11. The stress drop is essentially constant 
over five orders of magnitude in moment, although there ap¬ 
pear to be small differences between tectonic environments. 
Stress drops for interplate events average about 30 bars, 
whereas intraplate stress drops sometimes exceed 100 bars. 

There also seem to be differences among earthquakes at 
different plate boundary types. Figure 4.6-12 shows $M_0/\tau^3$, the 
ratio of seismic moment to the cube of the observed total time 
function duration (rise time plus rupture time), for some oceanic ridge, 
transform, and intraplate earthquakes. This quantity is 
approximately proportional to stress drop (Eqn 22) and is hopefully 
less model-dependent than stress drop estimates. For a given 
$M_w$, the ratio seems smaller for transform earthquakes than 
for ridges, perhaps implying lower stress drops. 

Another way to use such data is to take the different 
magnitudes to study how energy release varies with frequency. 
Compared to ridge earthquakes, transform earthquakes often 
have large $M_s$ relative to $m_b$ and large $M_w$ relative to $M_s$, 
suggesting that seismic wave energy is relatively greater at longer 
periods. Earthquakes that preferentially radiate at longer 
periods are called “slow” earthquakes. Slow earthquakes have 
been noted in various environments. For example, slow 
earthquakes underwater with the appropriate location and focal 
geometry can cause very large tsunamis (Section 1.2.4) that are 
not predicted by tsunami warning systems based on real-time 
assessments of $m_b$ or $M_s$. 

Fig. 4.6-12 Source parameters for some 
oceanic ridge, transform, and intraplate 
earthquakes. The transform earthquakes 
have relatively longer time functions and 
higher $M_s/m_b$, $M_w/M_s$, and $M_0/\tau^3$ ratios, 
implying that they are “slow” earthquakes, 
perhaps with lower stress drop. (Stein 
and Pelayo, 1991. Reproduced with the 
permission of the Royal Society of London.) 


Fig. 4.6-13 Theoretical source spectra for earthquakes with the same 
seismic moment and fault shape. For each pair of spectra with the same 
rupture velocity, the left curve for lower stress drop corresponds to larger 
fault dimensions, and hence longer time functions and smaller corner 
frequencies. This earthquake would be “slower” with less high-frequency 
radiation and lower $M_s$ and $m_b$. Similar effects occur for slower rupture 
velocity. The x axis is in frequency (Hz) rather than angular frequency ($\omega$). 

Differences between $m_b$, $M_s$, and $M_w$ can reflect differences 
in stress drop. Figure 4.6-13 illustrates this using theoretical 
source spectra for earthquakes with the same seismic moment. 
For a given moment and fault shape, Eqn 17 shows that a lower 
stress drop corresponds to larger fault dimensions, and hence 
longer time functions and smaller corner frequencies. Thus, 
given two earthquakes with the same rupture velocity, the one 
with lower stress drop will have less high-frequency radiation, 
and thus lower $M_s$ and $m_b$. Similar effects can result from a 
slower rupture velocity, which also gives a longer time function 
for a given fault dimension. These two possibilities can be 
distinguished when the rupture velocity can be inferred from the 
relative time between sub-events, as in Figs 4.3-11 or 4.5-11. 

Thus the stress drop both characterizes earthquake source 
spectra and gives insight into the physics of faulting. From 
a source spectrum view, earthquake magnitudes saturate 
because the stress drop is essentially constant as earthquake 
moment increases, so the ratio of the slip to fault length 
remains constant. As a result, larger-moment earthquakes have 
longer faults and hence lower corner frequencies. From a fault 
mechanics view, the fact that the ratio of the slip to fault length 
is constant indicates that strain release in earthquakes is 
roughly constant, at about 

$$\Delta\sigma/\mu \approx 10^{-4}, \tag{24}$$

assuming a stress drop of 50 bars and $\mu = 5 \times 10^{11}$ dyn/cm$^2$, 
which are average values for earthquakes in the crust and the 
upper mantle. 

This brings us to the important and unresolved issue, which 
will be discussed in Section 5.7, that the 10-100 bar stress 
drops found for earthquakes are much less than the strength of 
rock found in laboratory friction experiments. One possibility 
is that the low stress drop reflects the average of highly variable 
slip over a fault plane, whereas strength is much higher in 
strong patches (sometimes called asperities) where the largest 
slip occurs. However, other data, such as the absence of heat 
flow anomalies at faults, also imply that faults are weaker than 
expected from the laboratory results. As a result, it is not clear 
whether earthquakes release most of the stress built up on a 
fault or only a small fraction of it. It is similarly unclear how 
to interpret the possible variations in energy release as a func¬ 
tion of period, and perhaps stress drop, in different tectonic 
environments. Intuitively, they might reflect interplate earth¬ 
quakes occurring more frequently than intraplate events on 
better-established, and perhaps thus weaker, faults. Similarly, 
established transforms may be weaker than newly formed 
near-ridge crust. 

This discussion leads naturally to the question of how the 
seismic wave energy radiated by an earthquake is related to its 
moment and magnitude. To address this, recall that work 
equals force times distance, so the strain energy released is the 
product of the average stress during faulting, $\bar{\sigma}$, the average 
slip, and the fault area, 

$$W = \bar{\sigma} D S. \tag{25}$$

If the stresses before and after faulting are $\sigma_0$ and $\sigma_1$, then 
$\Delta\sigma = \sigma_0 - \sigma_1$, and $\bar{\sigma} = \sigma_1 + \Delta\sigma/2$. Some of this energy, H, is lost 
to friction, so the radiated seismic energy is 

$$E = W - H = \bar{\sigma} D S - \sigma_f D S, \tag{26}$$

where $\sigma_f$ is the frictional stress, or 

$$E = (\Delta\sigma/2) D S + (\sigma_1 - \sigma_f) D S = E_0 + (\sigma_1 - \sigma_f) D S. \tag{27}$$

Thus the quantity 

$$E_0 = (\Delta\sigma/2) D S = (\Delta\sigma/2\mu) M_0 \tag{28}$$

is a lower bound on the radiated seismic energy (Fig. 4.6-14). 
If faulting stops once the final stress equals the frictional stress, 
$\sigma_1 = \sigma_f$, then $E_0 = E$, the radiated energy. Note that the radiated 
energy is proportional to the stress drop. 
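
The step from Eqn 26 to Eqns 27 and 28 is a direct substitution, written out here as a short sketch. It uses only the definition of the average stress (the final stress plus half the stress drop) and the standard definition of the seismic moment as rigidity times average slip times fault area, $M_0 = \mu D S$.

```latex
\begin{aligned}
E &= \bar{\sigma} D S - \sigma_f D S
   = \left(\sigma_1 + \frac{\Delta\sigma}{2}\right) D S - \sigma_f D S \\
  &= \frac{\Delta\sigma}{2}\, D S + (\sigma_1 - \sigma_f)\, D S , \\
E_0 &= \frac{\Delta\sigma}{2}\, D S
     = \frac{\Delta\sigma}{2\mu}\,(\mu D S)
     = \frac{\Delta\sigma}{2\mu}\, M_0 .
\end{aligned}
```

The second term in $E$ is non-negative as long as the final stress is at least the frictional stress, which is why $E_0$ acts as a lower bound.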

The ratio of the radiated energy to the total strain energy 
release is called the seismic efficiency, 

$$\eta = E/W = \Delta\sigma/(2\bar{\sigma}), \tag{29}$$

where the last form assumes that $E_0 = E$. The efficiency depends 
on the final stress or, equivalently, the ratio of stress drop to the 
average stress. The case $\Delta\sigma \ll \bar{\sigma}$ is called partial stress drop, 
whereas $\Delta\sigma \approx 2\bar{\sigma}$ corresponds to near-total stress drop. It is still 
unresolved which of these cases is appropriate for earthquakes, 
because, of all the parameters in this model, only the stress 
drop can be directly estimated from seismological data. 


Fig. 4.6-14 Schematic illustration of the relation between the total strain 
energy released in faulting (W) and its portions radiated seismically (E) 
and dissipated by friction (H). In the model, these depend on the initial and 
final stresses ($\sigma_0$ and $\sigma_1$), their average ($\bar{\sigma}$), the stress drop ($\Delta\sigma$), and the 
frictional stress ($\sigma_f$). If the final stress equals the frictional stress, $E_0 = E$. 
Of these quantities, only stress drop can be estimated directly from 
seismological data. 

This model of the seismic energy radiated in earthquakes 
underlies the concept of moment magnitude. Assuming a stress 
drop of 50 bars and $\mu = 5 \times 10^{11}$ dyn/cm$^2$, as in Eqn 24, Eqn 28 
yields 

$$E_0 = M_0/(2 \times 10^4), \qquad \log E_0 = \log M_0 - 4.3, \tag{30}$$

where $E_0$ is in ergs. Inverting the definition of moment 
magnitude (Eqn 5) gives 

$$\log M_0 = 1.5M_w + 16.1, \tag{31}$$

so the second part of Eqn 30 becomes 

$$\log E_0 = 1.5M_w + 11.8. \tag{32}$$

This relation illustrates that an increase in earthquake 
magnitude of 1 unit, for example from 5 to 6, increases the radiated 
energy by a factor of $10^{1.5}$, or about 32. Hence a magnitude 
7 earthquake releases $10^3$, or 1000, times more energy than 
a magnitude 5 event. This ratio is strictly valid only for 
earthquakes with the same stress drop, but is a good general 
approximation. 
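
The chain from moment magnitude to radiated energy is easy to reproduce numerically. The sketch below assumes, as in the text, a stress drop of 50 bars (5 x 10^7 dyn/cm²) and a rigidity of 5 x 10^11 dyn/cm², so that the energy estimate is the moment divided by 2 x 10^4, together with log M_0 = 1.5 M_w + 16.1. The function names are ours and the output is only as good as the constant stress drop assumption.

```python
import math

STRESS_DROP = 5.0e7   # dyn/cm^2, i.e. 50 bars (assumed average value)
RIGIDITY = 5.0e11     # dyn/cm^2 (assumed average value)


def moment_from_mw(mw):
    """Seismic moment in dyn-cm from moment magnitude."""
    return 10.0 ** (1.5 * mw + 16.1)


def energy_estimate(moment, stress_drop=STRESS_DROP, rigidity=RIGIDITY):
    """Lower-bound radiated energy in ergs: (stress drop / 2 mu) * M0."""
    return stress_drop / (2.0 * rigidity) * moment


def energy_ratio(mw_large, mw_small):
    """Ratio of the energy estimates for two moment magnitudes."""
    return (energy_estimate(moment_from_mw(mw_large))
            / energy_estimate(moment_from_mw(mw_small)))


print(f"M 6 versus M 5: factor of {energy_ratio(6.0, 5.0):.1f}")
print(f"M 7 versus M 5: factor of {energy_ratio(7.0, 5.0):.0f}")
print(f"log E0 for M 7: {math.log10(energy_estimate(moment_from_mw(7.0))):.1f}")
```

Because the stress drop and rigidity cancel in the ratio, the factor of about 32 per magnitude unit does not depend on the particular values assumed for them, only on their being the same for both events.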

Equation 30 also illustrates the intriguing fact that although 
the seismic moment has the dimensions of energy (1 erg = 
1 dyn-cm), the radiated energy is only $1/(2 \times 10^4)$, or 0.00005, 
of the seismic moment released. This is because the seismic 
moment is not an energy, but instead is fundamentally related 
to the integral of the stress change over the earthquake source 
region, which gives the moment dimensions of dyn/cm$^2$ $\times$ cm$^3$, 
or dyn-cm. We can view Eqn 28 as converting the moment to a 
strain, and then multiplying by the stress acting during the 
earthquake to find the strain energy radiated. To illustrate that 
seismic moment and energy are different, seismic moment is 
quoted in dyn-cm (or N-m), and seismic energy is given in ergs 
(or J), even though the units are equivalent. 


4.7 Earthquake statistics 

In discussing earthquake source parameters, we saw that 
interesting insights into earthquake processes, such as magnitude 
saturation and constant stress drop, came from considering 
general properties of large numbers of earthquakes. Hence 
we now turn to some ideas about the statistics of earthquake 
populations, which have implications for both source 
processes and hazard estimation.$^1$ 

4.7.1 Frequency-magnitude relations 

As mentioned in Section 1.2, the number of earthquakes 
that occur yearly around the world varies with magnitude, with 
successively smaller earthquakes being more common. This 
observation was quantified by Gutenberg and Richter$^2$ in the 
1940s via the logarithmic earthquake frequency-magnitude 
relation 

$$\log N = a_1 - bM, \tag{1}$$

in which N is the number of earthquakes with magnitude 
greater than M occurring in a given time. The distribution is 
described by a linear relation, with constants $a_1$ and b. It turns 
out that although the intercept, $a_1$, depends on the number of 
earthquakes in the time and region sampled, the slope, b, is 
generally about 1. This is shown in Fig. 4.7-1 for the nearly 
13,000 earthquakes with $M_s > 5$ for the 30 years between 1968 
and 1997. There is an approximately tenfold increase in the 
number of earthquakes for successively smaller magnitudes: 
annually around the world there are about one $M_s = 8$ 
earthquake, 10 $M_s = 7$ events, 100 $M_s = 6$ events, and so forth. 
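
The tenfold rule follows directly from the relation log N = a_1 - bM with b = 1. As a minimal sketch, the intercept below is set to a_1 = 8 purely so that the rate is one event per year at magnitude 8, matching the round numbers quoted above. It is not a fitted value, and the function name is ours.

```python
def cumulative_rate(magnitude, a1=8.0, b=1.0):
    """Earthquakes per year with magnitude above `magnitude`.

    a1 = 8 is an illustrative choice giving one event per year at M = 8.
    """
    return 10.0 ** (a1 - b * magnitude)


for m in (8, 7, 6, 5):
    print(f"M > {m}: about {cumulative_rate(m):.0f} per year")
```

A different region or time window changes only `a1`. The ratio between successive magnitudes is fixed by `b`.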

A striking feature of this relation, sometimes called the 
Gutenberg-Richter relation , is that it also applies in individual 
seismic areas, with b generally about 1. Thus, although the 
number of earthquakes depends on how seismically active an 
area is, the relative frequency (M > 6 earthquakes about 10 times 
more common than M > 7, etc.) still applies. For example, in 
the past 1300 years Japan is estimated to have had about 190 
earthquakes with M > 7 and 20 with M > 8. Similarly, since 
1816 southern California has had about 180 earthquakes with 
M > 6, 24 with M > 7, and 1 with M > 8; whereas the New 
Madrid (central USA) seismic zone has had about 16 earth¬ 
quakes with M > 5 and 2 with M > 6. Although the precise 
numbers, especially for the rarer large earthquakes, depend on 
the period chosen and uncertainties in estimating magnitudes 



Fig. 4.7-1 Frequency-magnitude plot for all earthquakes with $M_s > 5.0$ 
during 1968-97 listed in the catalog of the National Earthquake 
Information Center. The logarithm of the numbers of earthquakes as a 
function of magnitude ($M_s$, horizontal axis) gives a line with slope (b) about 1. The values are 
shown both as a cumulative curve for the number of earthquakes per year 
with magnitude greater than or equal to a certain value and as incremental 
values in 0.1 magnitude unit bins. 

prior to the invention of the seismometer (in about 1890), the 
logarithmic decay still appears. 
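
The regional counts quoted above can be turned into rough b values by taking the logarithm of the ratio of two cumulative counts and dividing by the magnitude difference. This two-point estimate is only a sketch: it ignores the uncertainties in the historical counts and is not how b values are normally fitted. The function name is ours.

```python
import math


def b_from_counts(n_low, n_high, m_low, m_high):
    """Two-point b value from cumulative counts above two magnitudes."""
    return math.log10(n_low / n_high) / (m_high - m_low)


# Counts as quoted in the text.
estimates = {
    "Japan (M > 7 vs M > 8)": b_from_counts(190, 20, 7, 8),
    "Southern California (M > 6 vs M > 7)": b_from_counts(180, 24, 6, 7),
    "New Madrid (M > 5 vs M > 6)": b_from_counts(16, 2, 5, 6),
}
for region, b_value in estimates.items():
    print(f"{region}: b is about {b_value:.2f}")
```

All three estimates fall close to 1 even though the absolute numbers of earthquakes differ by an order of magnitude between regions.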

Such a pattern, called fractal scaling, self-similarity, or scale 
invariance, is common in nature. For instance, a coastline or 
river drainage pattern looks similar when viewed at scales of 1, 
10, 100, or 1000 km. The idea that the distribution of earth¬ 
quake size is invariant with respect to scale except for the 
largest earthquakes is part of the rationale for the hypothesis 
that earthquakes are unpredictable, because there is no way to 
predict which small earthquakes will grow into large ones 
(Section 1.2.6). 

The frequency-magnitude relation applies not only to the 
cumulative number N of earthquakes greater than a given 
magnitude, but to the incremental numbers n in a magnitude 
range M to M + dM. To see this, we write Eqn 1 as 

$$N = 10^{a_1 - bM}, \tag{2}$$


1 Seismologists, like other geoscientists, have an ambivalence toward statistics, 
finding them valuable but often insufficient to require discarding models that do not 
rise to statistical significance. Often our attitude recalls the adage that statistics should 
be used as a drunk uses a lamp post — more for support than illumination. Sometimes 
this works; when asked whether he had statistically tested his exciting magnetic 
anomaly results showing symmetric sea floor spreading at mid-ocean ridges, F. Vine 
said that he never touched statistics but just dealt with facts (Menard, 1986). 

2 The many important contributions to seismology of Beno Gutenberg (1889-1960) 
and Charles Richter (1900-85) include quantifying global and regional seismicity. 


differentiate it with respect to M, and take the logarithm, so 

$$\log \left| \frac{dN}{dM} \right| = \log n = a_2 - bM, \tag{3}$$

where $a_2$ is a new constant. Thus although the intercept 
changes, the slope b stays constant. The data in Fig. 4.7-1 show 

Fig. 4.7-2 Frequency-magnitude plot of all earthquakes during 1976-98 
with seismic moments measured by the Harvard CMT project. The slope 
($\beta$) of this distribution (solid lines) is -2/3, consistent with a b value of 1. 
The values are shown both as a cumulative curve for the number of 
earthquakes per year with log M 0 greater than or equal to a certain value, 
and as incremental values in 0.1 log M 0 bins. 

this effect, although the linear fit is better for the cumulative 
values because the numbers are larger, and hence less affected 
by time sampling. Using more earthquakes by sampling longer 
intervals and/or larger areas produces better fits. Conversely, 
the shorter the time or the smaller the area, the more the fit 
is degraded by the statistics of small numbers, as discussed 
shortly. 
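
The relation between the cumulative and incremental intercepts can be made explicit. The following is a sketch of the differentiation, assuming n is taken as the magnitude of dN/dM (N decreases with M, so the derivative itself is negative):

```latex
\begin{aligned}
N &= 10^{a_1 - bM}, \\
n &= \left|\frac{dN}{dM}\right| = b \ln(10)\, 10^{a_1 - bM}, \\
\log n &= \left[a_1 + \log(b \ln 10)\right] - bM .
\end{aligned}
```

Under this assumption the new intercept exceeds $a_1$ by $\log(b \ln 10)$, which is about 0.36 for b = 1. Counts in bins of finite width $\Delta M$ shift the intercept further by $\log \Delta M$, but in every case the slope remains b.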

Although the data in Fig. 4.7-1 are generally well described 
by the linear relation, there are deviations. The data deviate 
from the b = 1 line for very small ($M_s < 3$) magnitudes, because 
the global earthquake catalog is incomplete, with many small 
earthquakes not detected. The deviation for large ($M_s > 7.5$) 
earthquakes is expected, because the surface wave magnitude 
saturates (Fig. 4.6-6). To address this issue we can use the 
seismic moment, which better indicates the size of large 
earthquakes. Using the definition of moment magnitude (Eqn 4.6.5) 
in Eqn 1 yields 

$$\log N = a_1 - b(\log M_0/1.5 - 10.73) = a - \beta \log M_0. \tag{4}$$

This linear relation, with slope $\beta = b/1.5 \approx 2/3$, is shown in 
Fig. 4.7-2 for global earthquakes. The equation can also be 
written in an incremental form analogous to Eqn 3. 
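
Collecting terms in Eqn 4 gives the moment-form constants explicitly: the intercept becomes a_1 + 10.73b and the slope becomes b/1.5. The sketch below checks that the magnitude and moment forms predict the same rate. The intercept a_1 = 8 is an arbitrary illustrative value and the function names are ours.

```python
import math


def moment_magnitude(moment):
    """Moment magnitude from seismic moment in dyn-cm."""
    return math.log10(moment) / 1.5 - 10.73


def moment_form(a1, b):
    """Return (a, beta) such that log N = a - beta * log M0."""
    return a1 + 10.73 * b, b / 1.5


def rate_from_magnitude(mw, a1, b):
    """Cumulative rate from the magnitude form, log N = a1 - b * Mw."""
    return 10.0 ** (a1 - b * mw)


def rate_from_moment(moment, a1, b):
    """Cumulative rate from the moment form, log N = a - beta * log M0."""
    a, beta = moment_form(a1, b)
    return 10.0 ** (a - beta * math.log10(moment))


m0 = 1.0e27  # dyn-cm
print(f"Mw = {moment_magnitude(m0):.2f}")
print(f"rate from magnitude form: {rate_from_magnitude(moment_magnitude(m0), 8.0, 1.0):.2f}")
print(f"rate from moment form:    {rate_from_moment(m0, 8.0, 1.0):.2f}")
```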

The data in Fig. 4.7-2 deviate from the linear frequency-moment 
relation at large and small moments, just as the 
frequency-magnitude data did. The deviation for small 
earthquakes is likely due in part to the incomplete earthquake 
catalog, but is also expected from energy considerations, as 
discussed in problem 20. However, the deviation at large 
moments is more puzzling, since moments do not saturate. For 
moments above $10^{27}$ dyn-cm, the data are more consistent 
with $\beta > 1$ than $\beta = 2/3$. In other words, there are fewer 
earthquakes than expected for a given moment. 



Fig. 4.7-3 Log-log plot of fault length versus seismic moment. Most 
earthquakes fall between the solid lines with slopes of 1/3, showing $M_0$ 
proportional to $L^3$. However, strike-slip earthquakes (solid diamonds) 
have moments lower than expected for their fault lengths, because above a 
certain moment the fault width reaches a maximum, so the fault grows 
only in length. (Romanowicz, 1992. Geophys. Res. Lett., 19, 481-4, 
copyright by the American Geophysical Union.) 

A model for this phenomenon based on the concept of scale 
invariance assumes that the probability of an earthquake of 
a given size on a fault is inversely proportional to the area of 
faulting involved, so the number N of earthquakes with fault 
area greater than S should obey a frequency-area relation like 
those for magnitude or moment 

$$\log N = c - \log S. \tag{5}$$

We saw in Eqn 4.6.17 that for constant stress drop the 
moment is proportional to $S^{3/2}$, or the fault dimension L cubed, 
so we expect 

$$\log N = c - \tfrac{2}{3} \log M_0, \tag{6}$$

which is consistent with the observations showing $\beta \approx 2/3$. 
However, we have seen that for large transform fault 
earthquakes, which occur on vertical faults, the width (down-dip 
extent) stays narrow even as fault length increases (Fig. 4.6-3). 
As a result, the seismic moment for such earthquakes is no 
longer proportional to $L^3$, and is smaller than for other 
earthquakes of comparable fault length, as shown in Fig. 4.7-3. If 
both the fault slip and the fault width no longer increase with 
length, then the fault area, moment, and number of 
earthquakes should be proportional to L, so by Eqns 4 and 5 we find 
that $\beta = 1$. Such an increase is suggested by the data for the 
largest earthquakes in Fig. 4.7-2. 
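
The two scaling regimes can be laid side by side. This sketch assumes only the frequency-area relation (N inversely proportional to S), the constant stress drop scaling of moment with area, and the standard definition of moment as rigidity times slip times fault area. Here W and D denote the fault width and slip, taken as fixed in the second case, and the constants are written as $c'$ and $c''$ because they absorb different factors in each regime.

```latex
\begin{aligned}
&\text{Self-similar faults:} && M_0 \propto S^{3/2}
  \;\Rightarrow\; \log S = \tfrac{2}{3}\log M_0 + \text{const}
  \;\Rightarrow\; \log N = c' - \tfrac{2}{3}\log M_0, \quad \beta = \tfrac{2}{3}, \\
&\text{Fixed } W \text{ and } D\text{:} && S = WL \propto L, \quad M_0 = \mu D W L \propto L
  \;\Rightarrow\; S \propto M_0
  \;\Rightarrow\; \log N = c'' - \log M_0, \quad \beta = 1 .
\end{aligned}
```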

The frequency-moment data give insight into earthquake 
energy release because the radiated energy is proportional to 
the seismic moment (Eqn 4.6.30). The few largest earthquakes 
release much more energy than the many smaller earthquakes. 
In fact, the largest earthquake in a given year often releases 
more energy than the rest of the year’s earthquakes. This effect 
is illustrated in Fig. 4.7-4, which shows the cumulative seismic 
moment release since 1976. The annual moment release by 
earthquakes below a given size, such as $M_w < 7.5$, is 
fairly constant. However, the total annual moment release 
shown by the jagged top curve, which averages about 
$3.5 \times 10^{28}$ dyn-cm per year, is variable due to the occurrence of a few 
very large events. Using Eqn 4.6.30, this moment release 
corresponds to an annual energy release of about $2 \times 10^{24}$ erg or 
$2 \times 10^{17}$ J. Thus in Table 1.2-1 we saw that the annual 
magnitude 8 earthquake provides about half the total annual seismic 
energy released, and that successively smaller, but more 
common, earthquakes contribute less, so the contribution of 
earthquakes with magnitudes less than 6 is negligible. 

Fig. 4.7-4 Cumulative seismic moment for the earthquakes in Fig. 4.7-2 
(horizontal axis: year, 1976-2000). 
The total global seismic moment release is dominated by the few largest 
events. The total moment for 1976-98 is about 1/3 that of the giant 1960 
Chilean earthquake. 

Although b values approximately equal 1 over long time 
scales and large spatial scales, significant variations occur on 
smaller scales. The b value of earthquake swarms is often much 
larger than 1, sometimes approaching 2.5. These swarms, 
which lack a mainshock, are often associated with volcanic 
regions, and may result from processes such as the migration 
of magmatic fluids or caldera development. For example, 
seismicity associated with the collapse of the Fernandina 
caldera in the Galapagos Islands in 1968 had $b \approx 1.9$, indicating 
many small earthquakes but fewer large ones than expected. 

The b value also varies regionally, both spatially and with 
depth. Figure 4.7-5 shows the variation in b value on a segment 
of the Calaveras fault in California. Some patches have b values 
much less than 1, implying shorter recurrence time. These 
patches have been interpreted as possible asperities or stress 
concentrations, perhaps reflecting variations in frictional 
properties along the fault, which may control the recurrence of the 
next large earthquake and have large moment release during it. 

Other intriguing possible deviations from b = 1 have been 
reported. Figure 4.7-6 (left) shows earthquake magnitudes 
and frequencies for large earthquakes inferred from geological 
paleoseismic studies, which deviate from the seismologically 
determined frequency-magnitude data. These observations have 
been interpreted as showing large (sometimes termed 
characteristic) earthquakes more frequent than would be expected 
from the linear relation derived from the instrumental data, 

Fig. 4.7-5 Variation of b values for small 
earthquakes with depth and distance along 
the Morgan Hill segment of the Calaveras 
fault during 1971-84. Regions with low b 
values may have a shorter time until the 
next large earthquake. The 1984 Morgan 
Hill earthquake occurred in a region of 
low b values. (Wiemer and Wyss, 1997. J. 
Geophys. Res., 102, 15,115-28, copyright 
by the American Geophysical Union.) 




Fig. 4.7-6 Deviations from a linear 
frequency-magnitude relation. Left : 
Paleoseismic results (box) for the Wasatch 
fault zone (Utah) showing large earthquakes 
more frequent than expected from the 
instrumental seismicity (dots). The solid line 
is a model for this effect. (Youngs and 
Coppersmith, 1985. © Seismological Society 
of America. All rights reserved.) Right: 
Incremental frequency-magnitude data for 
instrumentally studied earthquakes in 
continental interiors, showing larger 
earthquakes less frequent than expected 
from the smaller earthquakes. (Triep and 
Sykes, 1997. J. Geophys. Res., 102, 9923-48, 
copyright by the American Geophysical 
Union.) 

Fig. 4.7-7 Numerical simulation showing apparent deviations from a 
linear frequency-magnitude relation resulting from small sample size. 
When data satisfying a linear relation (upper solid line) are divided into 
subsets, most subsets have largest earthquakes either above or below the 
ideal linear relation for the subsets (lower solid line). (Howell, 1985. 

© Seismological Society of America. All rights reserved.) 

although the opposite has also been observed (Fig. 4.7-6, 
right). It is not yet clear whether these effects are real or due 
to differences in magnitude and frequency estimation between 
seismological and geological approaches. A study using seis¬ 
mological data for continental interiors finds that the largest 
earthquakes are less frequent than expected from the smaller 
earthquakes (Fig. 4.7-6, right). These observations are inter¬ 
preted as showing a possible small deviation toward higher 
frequency at about M w 7, followed by a significant decrease as 
observed in the global data (Fig. 4.7-1), presumably due to 
finite fault width. The Gutenberg-Richter relation can be 
modified to describe the different deviations from linearity. 

Some deviations of the largest earthquakes in an area from a 
linear frequency-magnitude relation may reflect small sampling 
(Section 6.5.2). Figure 4.7-7 illustrates this effect by dividing 
an earthquake population that follows a Gutenberg-Richter 
distribution into ten subsets. Because only one subset contains 
the largest earthquake, and some have a much smaller largest 
earthquake, the frequency-magnitude relations for the subsets 
have considerable scatter. The largest earthquakes appear in 
some cases more and in other cases less frequent than for the 
total population. Thus the b value is reasonably well estimated 
from the smaller earthquakes, but not the largest ones. This 
effect may occur if the number of large earthquakes in the 
study is small. For example, we might expect this for 
individual faults, even in a region whose overall seismicity obeys 
a Gutenberg-Richter distribution. 
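
A small numerical experiment in the spirit of this argument is easy to set up. The sketch below is not the simulation shown in the figure: it draws a synthetic catalog whose magnitudes follow a Gutenberg-Richter distribution with b = 1 above an assumed minimum magnitude of 5, splits it into ten subsets, and compares the largest event and a b value estimate for each subset. The b value estimator is the maximum-likelihood formula of Aki (1965), a technique we bring in for illustration. The catalog size, seed, and function names are arbitrary choices.

```python
import math
import random


def sample_magnitudes(count, b=1.0, m_min=5.0, seed=1):
    """Draw magnitudes with P(M > m) = 10**(-b * (m - m_min))."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    return [m_min - math.log10(1.0 - rng.random()) / b for _ in range(count)]


def aki_b_value(magnitudes, m_min):
    """Maximum-likelihood b value for continuous magnitudes above m_min."""
    mean_excess = sum(magnitudes) / len(magnitudes) - m_min
    return math.log10(math.e) / mean_excess


catalog = sample_magnitudes(10000)
subsets = [catalog[i::10] for i in range(10)]  # ten interleaved subsets

print(f"whole catalog: b = {aki_b_value(catalog, 5.0):.2f}, "
      f"largest M = {max(catalog):.2f}")
for k, subset in enumerate(subsets):
    print(f"subset {k}: b = {aki_b_value(subset, 5.0):.2f}, "
          f"largest M = {max(subset):.2f}")
```

In runs of this kind the b values of the subsets stay close to that of the whole catalog, because they are controlled by the many small events, whereas the largest magnitude varies noticeably from subset to subset.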

A final point worth noting is that although the Gutenberg- 
Richter distribution predicts the frequency of arbitrarily large 
earthquakes, such earthquakes may not actually occur. As we 
have seen, the area available for faulting limits earthquake size. 
For example, we will see in Section 5.3.3 that the maximum 
moment of mid-ocean ridge earthquakes varies inversely with 
spreading rate. As a result, regional studies often assume the 
existence of a maximum magnitude, sometimes based on the 
earthquake history, in the Gutenberg-Richter distribution. 
This assumption has interesting implications, because on many 
plate boundaries the motion inferred from earthquake histories 
seems to be significantly less than the plate motion (Sections 
5.3.3, 5.4.3, 5.6.2). Hence, either the missing motion occurs in 
very large, rare earthquakes, or much of the motion occurs by 
aseismic processes. Which of these is the case is also of interest 
for seismic hazard studies. 


4.7.2 Aftershocks 

The smaller aftershocks following a mainshock have a
characteristic distribution in size and time. As previously noted (e.g.,
Fig. 4.5-9), most aftershocks occur on or near the mainshock’s 
fault plane, so their locations are used to distinguish between 
the fault and auxiliary planes and to estimate the fault area. 
The largest aftershock is usually more than a magnitude unit 
smaller than the mainshock, and the aftershocks have a size 
distribution with b near 1, so the total energy released by the 
aftershocks is usually less than 10% of that of the mainshock. 

Most of the aftershocks occur soon after the mainshock, and 
the remainder decay with time in a quasi-hyperbolic manner. 
This decay is described by a relation now called Omori’s law, 3

$$ n = \frac{C}{(K + t)^{P}}, \tag{7} $$

where n is the frequency of aftershocks at a time t after the
mainshock, with K, C, and P as fault-dependent constants. P is
typically about 1. This decay is illustrated by the aftershocks of
the 1989 ($M_s$ 7.1) Loma Prieta earthquake (Fig. 4.7-8).
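To make the decay concrete, the sketch below evaluates Omori's law and integrates it numerically to give the expected number of aftershocks in a time window. The constants (C = 100, K = 0.1 day, P = 1) and the function names are hypothetical values chosen only for illustration; they are not fitted to the Loma Prieta sequence.

```python
import math


def omori_rate(t, C, K, P=1.0):
    """Aftershock frequency n at time t after the mainshock: n = C / (K + t)**P."""
    return C / (K + t) ** P


def omori_count(t1, t2, C, K, P=1.0, steps=10000):
    """Expected number of aftershocks between t1 and t2 (midpoint-rule integration)."""
    dt = (t2 - t1) / steps
    return sum(omori_rate(t1 + (i + 0.5) * dt, C, K, P) for i in range(steps)) * dt


# Hypothetical constants, with time measured in days.
C, K, P = 100.0, 0.1, 1.0

for day in (1, 2, 5, 10, 20):
    print(f"day {day:2d}: rate = {omori_rate(day, C, K, P):6.1f} per day")

# For P = 1 the integral has the closed form C * ln((K + t2) / (K + t1)),
# which gives a check on the numerical integration.
numerical = omori_count(0.0, 10.0, C, K, P)
analytic = C * math.log((K + 10.0) / K)
print(f"first 10 days: numerical {numerical:.2f}, analytic {analytic:.2f}")
```

With P near 1 the rate falls off roughly as 1/t, so most of the aftershocks in this sketch occur in the first day or two.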

The aftershock decay is thought to reflect stress readjustment 
following the stress changes due to the mainshock. An intriguing
exception to Omori’s law is that most deep earthquakes
have many fewer, and often no, detected aftershocks. This
difference may reflect deep earthquakes resulting from phase


3 Fusakichi Omori (1868-1923), considered the founder of Japanese seismology, 
participated in the commission that studied the 1906 San Francisco earthquake 
(Section 4.1) and correctly assured worried citizens that no comparable earthquake 
would be expected for at least the next 50 years. 
Magnitude   Number   Effect
5           2        Damaging
4           20       Strong
3           65       Perceptible
2           384      Not felt
1           1855     Not felt
<1          2434     Not felt
Total       4760

4760 aftershocks of the Loma Prieta earthquake had been recorded by noon
on November 7, 1989. The diminishing number of aftershocks with time is
typical for large California earthquakes.

Fig. 4.7-8 Graphic showing the number and distribution of aftershocks in
22 days after the 1989 Loma Prieta earthquake as functions of magnitude
and time. (Courtesy of US Geological Survey.)


changes in mantle minerals (Section 5.4.2), which could
produce slip only once on a fault surface, in contrast to frictional
sliding, which can recur. 

4.7.3 Earthquake probabilities

A natural use of earthquake statistics is to estimate the
probability of future earthquakes. These probabilities are
interesting from the standpoint of earthquake physics, and crucial
for attempts to forecast the hazards due to large, damaging 
earthquakes (Section 1.2.5). 

The challenge of estimating earthquake probabilities can be 
illustrated by a simple analogy. Problems in probability are 
often couched as games of chance, but earthquakes have the 
special feature that the game’s rules are unknown. To see this, 
consider estimating the probability that particular playing 
cards will be dealt from a deck. If the game begins with a full 
deck, there is a 25% (13/52) chance of drawing a spade, an 8% 
(4/52) chance of an ace, and a 2% (1/52) chance of the ace of 
spades. These chances are analogous to the prospects of having 
a magnitude 6, 7, or 8 earthquake in a year. As play continues, 
there are several possible cases. If the deck is shuffled after 
every draw, the probabilities do not change. Alternatively, if 
the deck is not shuffled, the probabilities change depending on 
the cards that have been drawn. For example, if no aces have 
yet appeared, the probability of an ace increases with each 
draw. However, if cards are dealt from under the table, we do 
not know what cards the deck began with (there may be no aces 
or eight of them) and whether it is shuffled. We must infer what 
the deck contains, how it is shuffled, and what cards will appear, 
with no information except the cards already drawn. Hence 
if no aces have appeared after a large number of draws, the 
probability of an ace may be high (because the remaining cards 
contain several) or low (because the starting deck had few). 
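The card analogy can be put in numbers. The sketch below assumes the simplest unshuffled case, in which every card drawn so far was not an ace, and shows how the chance of an ace on the next draw depends on how many aces the deck began with. The function name and the choice of 20 draws are illustrative assumptions.

```python
from fractions import Fraction


def p_ace_next(draws_without_ace, aces_in_deck=4, deck_size=52):
    """Probability that the next card is an ace, for an unshuffled deck
    from which `draws_without_ace` non-ace cards have already been dealt."""
    remaining = deck_size - draws_without_ace
    return Fraction(aces_in_deck, remaining)


# Known rules: a standard deck with four aces.
for k in (0, 10, 20, 40):
    print(f"after {k:2d} draws without an ace: P(ace next) = {p_ace_next(k)}")

# Unknown rules: the same 20 ace-free draws are consistent with very
# different decks, and hence very different probabilities.
for aces in (0, 4, 8):
    p = p_ace_next(20, aces_in_deck=aces)
    print(f"deck began with {aces} aces: P(ace next) = {p}")
```

The second loop is the earthquake situation: the same history of draws supports quite different forecasts depending on what is assumed about the deck.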


In the nomenclature of probability theory, the probability of 
events depends on the probability density distribution that is 
sampled and the sampling method. For earthquakes, we know
neither, because we do not have a theoretical model that
successfully describes earthquake recurrence. We therefore adopt
probability distributions based on the earthquake history, which for
most faults is short (only a few recurrences) and complicated.
As a result, various distributions broadly consistent with the
limited history are used and can produce quite different estimates.

The simplest model describes earthquake occurrence by a
Poisson distribution, often used to describe rare events. 4 We
assume that the probability of n large earthquakes in an area
or on a fault during time t is

$$ p(n, t, \tau) = \frac{(t/\tau)^{n} e^{-t/\tau}}{n!}, \tag{8} $$

where $1/\tau$ is the number expected in a year from the regional
Gutenberg-Richter distribution or some variant, so $\tau$ is the
mean recurrence time. The probability of one or more
earthquakes is found from the probability that none will happen,
using the certainty (p = 1) that an earthquake either will or will
not happen, so

$$ p(n \geq 1, t, \tau) = 1 - p(0, t, \tau) = 1 - e^{-t/\tau} \approx t/\tau, \tag{9} $$

where the last step used the Taylor series expansion $e^{-x} \approx 1 - x$,
and so is valid for $t \ll \tau$. In this model, the probability that an
earthquake will occur in an interval of time t starting from now
does not depend on when “now” is, because a Poisson process
has no “memory.” On average, earthquakes are separated by
time $\tau$, but when the last earthquake occurred has no effect.
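A short numerical check of the Poisson model may help. The sketch below evaluates the probability of n events in time t and the probability of at least one event, and compares the latter with the small-t approximation. The 194-year mean recurrence time is borrowed from the San Andreas example later in this section; the function names are hypothetical.

```python
import math


def poisson_prob(n, t, tau):
    """Probability of exactly n events in time t for mean recurrence time tau."""
    x = t / tau
    return x ** n * math.exp(-x) / math.factorial(n)


def prob_at_least_one(t, tau):
    """Probability of one or more events in time t: 1 - p(0, t, tau)."""
    return 1.0 - math.exp(-t / tau)


tau = 194.0  # years; assumed here for illustration
for t in (1.0, 20.0, 100.0, 194.0):
    exact = prob_at_least_one(t, tau)
    approx = t / tau
    print(f"t = {t:5.0f} yr: exact = {exact:.3f}, approximation t/tau = {approx:.3f}")
```

The approximation is close when t is much less than the recurrence time and overestimates the probability as t approaches it.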

4 Examples include volcanic eruptions, radioactive decay, and the number of 
Prussian soldiers killed by their horses. 
The Poisson model is the simplest null hypothesis against
which we can compare other models. However, its
time-independence, in which earthquakes are implicitly random
events, is not appealing, because almost all of our seismological
instincts favor earthquake cycle models, in which strain builds
up slowly from one major earthquake to the next. 5 In this case,
the probability of a large earthquake should be small
immediately after a large earthquake, and then grow with time. This is
described by time-dependent models in which the probability
of a large earthquake a time t after the past one is given by a
probability density distribution $p(t, \tau, \sigma)$ that depends on the
average and variability of the recurrence times, described by
the mean $\tau$ and the standard deviation $\sigma$. In other words, p
gives the probability that the recurrence time for this
earthquake will be t, given an assumed distribution of recurrence
times. The cumulative probability that the earthquake will
occur by time T since the past earthquake is found by
integrating the density function

$$ P(T) = \int_{0}^{T} p(t, \tau, \sigma)\, dt. \tag{10} $$


We seek to estimate how likely an earthquake is between
now and some future time. Formally, this is the conditional
probability that the earthquake will occur between time $T_0$
(now) and a future time T, given the condition that it has not
yet happened by time $T_0$. To do this, we use Bayes's theorem,
which states that $P(A|B)$, the conditional probability of event
A given that event B has occurred, is the ratio of the joint
probability $P(A, B)$ of both A and B to $P(B)$, the probability of
event B:

$$ P(A|B) = P(A, B)/P(B). \tag{11} $$

In this case, the conditional probability $C(T, T_0)$ that the
earthquake will occur between $T_0$ and T is the ratio of the
probability that it will occur in that interval to the probability that it has
not yet happened by $T_0$, which is just 1 minus the probability
that it has. Hence




Fig. 4.7-9 Earthquake probability estimate for a segment of the San 
Andreas fault on which the last major earthquake occurred in 1857. 

Top: Probability density functions, with the interval 1983-2003 shaded. 
The dashed line is for a Gaussian distribution, with mean and standard 
deviations of 194 and 58 years, and the solid lines are for an alternative 
(Weibull) distribution. Bottom: Conditional probability that the next 
large earthquake will occur in the next 20 years, as a function of time 
since 1857. As of 1983 (arrow), the probabilities for the time-dependent 
models were comparable to those for a time-independent Poisson model. 
(Sykes and Nishenko, 1984. J. Geophys. Res., 89, 5905-27, copyright by
the American Geophysical Union.) 
$$ C(T, T_0) = \frac{P(T) - P(T_0)}{1 - P(T_0)}. \tag{12} $$

The denominator is less than one, so the conditional
probability is greater than the joint probability (numerator) because
the fact that the earthquake has not happened makes it more 
likely. 

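This approach can be used with any assumed probability
density function. The simplest is to assume that earthquake
recurrence follows the familiar Gaussian or normal (bell curve)
distribution (Section 6.5.1)

$$ p(t, \tau, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{t - \tau}{\sigma} \right)^{2} \right]. \tag{13} $$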

5 Of course, these instincts favoring determinism may ultimately prove incorrect — 
Einstein initially rejected quantum mechanics, arguing that “God does not play dice.” 


This distribution is often described using the normalized
variable $z = (t - \tau)/\sigma$ describing how far, in terms of the standard
deviation, t is from its mean.

Figure 4.7-9 shows such an analysis for the segment of the 
San Andreas fault including the Pallett Creek site (Fig. 1.2-15), 
on which the last major earthquake was the 1857 Fort Tejon 
earthquake. The analysis uses a Gaussian distribution with a 
mean and standard deviation of 194 and 58 years,
corresponding to the most recent five major earthquakes. The upper panel
shows the probability density function for this distribution 
(dashed line) and two others. These are used to estimate the 
conditional probability that a major earthquake would occur 
between 1983 (the study time) and 2003. These times are 126 
and 146 years since 1857, and so correspond to normalized 
Fig. 4.7-10 Synthetic earthquake histories computed by sampling a
Poisson model with the recurrence time of 194 years and a Gaussian
model with this recurrence time and standard deviation 58 years. The
Gaussian model yields a more periodic series, whereas the Poisson model
yields clustering.

times of -1.17 and -0.83, with probabilities of 0.12 and 0.20.
Thus the conditional probability (Eqn 12) is

$$ C(2003, 1983) = \frac{P(2003) - P(1983)}{1 - P(1983)} = \frac{0.20 - 0.12}{1.0 - 0.12} = 0.09, \tag{14} $$

or 9%. The probability for successive 20-year intervals 
increases with time, and so is 29% if the earthquake has not 
occurred by 2057, and 56% if it has not occurred by 2157. 
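The arithmetic of this example can be checked with a few lines of code. The sketch below is one possible implementation, not the method of the original study: it evaluates the cumulative Gaussian probability with the error function rather than a table, and it integrates from minus infinity rather than from zero, an assumption that changes the result negligibly here because the mean is more than three standard deviations from zero. The function names are hypothetical.

```python
import math


def gaussian_cdf(t, tau, sigma):
    """Cumulative probability P(T) for a Gaussian with mean tau and
    standard deviation sigma, written with the error function."""
    z = (t - tau) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_probability(t0, t, tau, sigma):
    """C(T, T0) = (P(T) - P(T0)) / (1 - P(T0)), with times measured
    from the last earthquake."""
    p0 = gaussian_cdf(t0, tau, sigma)
    p1 = gaussian_cdf(t, tau, sigma)
    return (p1 - p0) / (1.0 - p0)


LAST_EVENT = 1857          # Fort Tejon earthquake
TAU, SIGMA = 194.0, 58.0   # years, as used for Fig. 4.7-9

for start in (1983, 2057, 2157):
    t0 = start - LAST_EVENT
    c = conditional_probability(t0, t0 + 20, TAU, SIGMA)
    print(f"{start}-{start + 20}: conditional probability = {100.0 * c:.1f}%")
```

The values should agree with those quoted above to within the rounding of the tabulated probabilities.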

It is interesting to compare these time-dependent
probabilities to those predicted by the time-independent Poisson
model. For an assumed mean recurrence time of 194 years, the
probability in 20 years is 10%. Thus for times since the
previous earthquake less than about 2/3 of the assumed recurrence
interval, the Poisson model predicts higher probabilities. At
about 2/3 of the interval, in this case about 1986, the models
predict comparable probabilities. At later times the Gaussian
model predicts progressively greater probabilities. This
comparison illustrates the seismic gap concept: a gap exists when
it has been long enough since the last major earthquake that 
time-dependent models predict an earthquake probability much 
higher than expected from time-independent models. 
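The crossover between the two models can be located numerically. The sketch below, a minimal illustration with hypothetical function names, steps forward one year at a time from the last earthquake and reports the first year in which the 20-year Gaussian conditional probability reaches the constant Poisson value. As an assumption, the Gaussian cumulative probability is integrated from minus infinity rather than zero.

```python
import math


def gaussian_cdf(t, tau, sigma):
    """Cumulative Gaussian probability with mean tau and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((t - tau) / (sigma * math.sqrt(2.0))))


def gaussian_window(t0, dt, tau, sigma):
    """Conditional probability of an event in (t0, t0 + dt], given none by t0."""
    p0 = gaussian_cdf(t0, tau, sigma)
    return (gaussian_cdf(t0 + dt, tau, sigma) - p0) / (1.0 - p0)


def poisson_window(dt, tau):
    """Poisson probability of one or more events in any window of length dt."""
    return 1.0 - math.exp(-dt / tau)


def first_crossover(dt, tau, sigma, t_max=1000):
    """First whole number of years since the last event at which the Gaussian
    conditional probability equals or exceeds the Poisson probability."""
    p_poisson = poisson_window(dt, tau)
    for t0 in range(t_max):
        if gaussian_window(t0, dt, tau, sigma) >= p_poisson:
            return t0
    return None


TAU, SIGMA, WINDOW = 194.0, 58.0, 20.0
t_cross = first_crossover(WINDOW, TAU, SIGMA)
print(f"Poisson 20-year probability: {100.0 * poisson_window(WINDOW, TAU):.1f}%")
print(f"Gaussian model overtakes it {t_cross} years after the last event,")
print(f"i.e. about {1857 + t_cross}, or {t_cross / TAU:.2f} of the mean recurrence time")
```

Before the crossover the Poisson model gives the higher probability; after it the Gaussian probability keeps rising while the Poisson value stays fixed.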

The differences between the models can be illustrated by 
comparing the earthquake histories that each predicts. Figure
4.7-10 shows synthetic earthquake histories generated by
randomly sampling probability distributions with the parameters
used in Fig. 4.7-9. In the simulation, both models yield ten
earthquakes after an earthquake at time zero. The earthquakes
from the Poisson model have a mean recurrence of 189 years
and a standard deviation of 107 years, whereas those for the
Gaussian model have a mean and standard deviation of 191
and 58 years, respectively. The difference results from the fact
that the Poisson process is time-independent, so there are both
shorter and longer intervals between earthquakes than for the
Gaussian process, which is more regular. The Poisson process
thus shows clustered earthquakes resulting from the random
sampling. In the limit of very long histories, the Poisson process
has a standard deviation of recurrence intervals equal to its
mean. Thus a recurrence history with standard deviation close
to the mean favors a Poisson process, whereas a standard
deviation significantly smaller than the mean suggests a Gaussian
or other time-dependent process. How to interpret the limited 
earthquake histories available is an interesting question, as 
illustrated by this simple example with ten recurrences, which 
is longer than usually available. 
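Synthetic histories of this kind are easy to generate. The sketch below is an illustration rather than the procedure used for Fig. 4.7-10: it draws recurrence intervals from an exponential distribution (the interval distribution of a Poisson process) and from a Gaussian distribution, using the 194-year mean and 58-year standard deviation from the text. The random seed, the decision to discard the rare negative Gaussian draws, and the function names are assumptions.

```python
import random
import statistics


def poisson_intervals(n, tau, rng):
    """Recurrence intervals for a Poisson process: exponential with mean tau."""
    return [rng.expovariate(1.0 / tau) for _ in range(n)]


def gaussian_intervals(n, tau, sigma, rng):
    """Gaussian recurrence intervals; rare negative draws are discarded."""
    intervals = []
    while len(intervals) < n:
        x = rng.gauss(tau, sigma)
        if x > 0.0:
            intervals.append(x)
    return intervals


def summarize(intervals):
    """Return the mean and sample standard deviation of the intervals."""
    return statistics.mean(intervals), statistics.stdev(intervals)


TAU, SIGMA = 194.0, 58.0
rng = random.Random(1)  # fixed seed (assumed) for repeatability

# Ten recurrences, as in the example in the text, then a very long history.
for n in (10, 50000):
    mp, sp = summarize(poisson_intervals(n, TAU, rng))
    mg, sg = summarize(gaussian_intervals(n, TAU, SIGMA, rng))
    print(f"{n:6d} events  Poisson: mean {mp:6.1f}, sd {sp:6.1f}"
          f"   Gaussian: mean {mg:6.1f}, sd {sg:6.1f}")
```

Rerunning the ten-event case with different seeds shows how widely the sample mean and standard deviation scatter for short histories, whereas the long history approaches the limiting values.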

These examples bear out that estimates of earthquake
probabilities depend significantly on both the probability
distribution used and the parameters for that distribution, which are
generally not well constrained by observations. For example,
the analysis in Fig. 4.7-9 used a Gaussian distribution with a
mean and standard deviation of 194 and 58 years,
corresponding to the most recent five major earthquakes at Pallett Creek.
Alternatively, the past ten earthquakes there yield a
recurrence with a mean and standard deviation of 132 and 105 years
(Section 1.2.5). Other probability distributions give different
probability estimates, as illustrated by the curves in Fig. 4.7-9
corresponding to Poisson and Weibull distributions. Similarly,
different estimates would result from using a log-normal
distribution in which the natural logarithm of recurrence time is
normally distributed, so recurrence intervals longer than the 
mean are more likely than shorter ones. 

Hence earthquake forecasts are easy to make, but hard to 
test. Because the estimates must be tested using data that 
were not used to derive them, hundreds or thousands of years 



Fig. 4.7-11 Portion of the seismic gap map (McCann et al., 1979) used
by Kagan and Jackson (1991) to test the gap hypothesis. The shaded 
segments of the plate boundaries had been assigned seismic potentials of 
high (red, R), intermediate (orange, O), and low (green, G). Unshaded 
segments were regarded as having uncertain potential. During the ten 
years following the map’s publication, ten large (M > 7) earthquakes 
(dots) occurred in these regions. None were in the high- or intermediate- 
risk segments, and five were in the low-risk segments. (Stein, 1992. 
Reproduced with permission from Nature.) 
Fig. 4.7-12 Conditional probabilities of
major earthquakes estimated for segments
of the San Andreas fault for the period
1988-2018. (Agnew et al., 1988. Courtesy
of the US Geological Survey.) 



(multiple recurrences) will be needed to assess how well various 
models predict large earthquakes on specific faults or fault 
segments. The first challenge is to show that a model predicts 
future earthquakes significantly better than the simple time- 
independent Poissonian model. 

Given human impatience, attempts have been made to
conduct alternative tests using smaller earthquakes or many faults
over a short time interval. To date, the results are not
encouraging. As discussed in Section 1.2.5, the history of relatively small
(M 5-6) earthquakes near Parkfield, California, was used in
1985 to predict at 95% confidence level that the next one would
occur by 1993, whereas the earthquake has not materialized to
date (2002). Presumably the earthquake will occur eventually,
although its conditional probability seems to have been
overestimated and might even be assumed to be decreasing, because
the longer the earthquake is delayed, the longer the mean
recurrence interval inferred from the earthquake history becomes. 6
Moreover, a global test of the seismic gap hypothesis, which
examined how well a gap map (Fig. 4.7-11) forecast the
locations of major earthquakes, found that the map did no better
than random guessing. In fact, many more large earthquakes 
occurred in areas identified as low risk than in the presumed 
higher-risk gaps. This result, which appears inconsistent with 

6 This situation, discussed by Davis et al. (1989), has been likened to waiting for a 
bus — the longer the bus fails to arrive, the less likely its arrival seems. A homework 
problem illustrates these issues. 


ideas of earthquake cycles and seismic gaps, has led to various 
interpretations, including that the gap model applies only to the 
largest events that break major portions of the plate boundary. 

Perhaps the most sophisticated large-scale earthquake
probability studies have been in California. Figure 4.7-12 shows
conditional probabilities estimated along segments of the San 
Andreas fault. Such models can also include factors such as 
variable slip in earthquakes and stress changes due to nearby 
earthquakes (Section 5.7). Testing more complicated models 
with more adjustable parameters, however, will be even more 
challenging and take even longer. 

Hence, at present, estimates of earthquake probabilities have
large uncertainties. For example, using the complex Pallett
Creek earthquake series (Fig. 1.2-15), in 1989 the range of
probabilities for a major earthquake before 2019 was
estimated as about 7-51%. 7 Thus it has been suggested that it is
only meaningful to quote probabilities in broad ranges, such as
low (<10%), intermediate (10-90%), or high (>90%). 8
However, despite these formidable difficulties, estimation of
earthquake probabilities seems certain to remain an active research
area. If some probability model is ultimately demonstrated 
to be reasonably successful, its use could advance efforts to 
estimate earthquake hazards. 

7 Sieh et al. (1989).

8 Savage (1991). 
Further reading

Other treatments of earthquake sources are given by texts such as 
Ben-Menahem and Singh (1981), Gubbins (1990), Lay and Wallace 
(1995), and Shearer (1999). Many of the results presented here without 
proof are derived in Aki and Richards (1980). 

Some specific topics are covered in individual reviews. Kanamori (1994) 
gives an overview of earthquake source parameters and earthquake 
mechanics; papers in Kanamori and Boschi (1983) review various topics 
about earthquake sources. Structural geology texts such as Ragan (1968) 
discuss stereonet techniques. Jarosch and Aboodi (1970) derive analytic 
expressions for the relations between the fault and auxiliary planes and 
the stress axes. Helmberger and Burdick (1979), Kanamori and Stewart 
(1976), and Okal (1992) discuss body wave modeling. For reasons
including compatibility of notation, our treatment of body wave modeling
follows the latter two, that for surface wave modeling follows Kanamori and
Stewart (1976), and that for moment tensor inversion follows Kanamori
and Given (1981). Jost and Hermann (1989) give a general review of
moment tensor inversion, and Dziewonski et al. (1981) summarize the
Harvard CMT approach. Okal and Geller (1979) explore spurious
isotropic moment tensor components due to lateral heterogeneity, Michael
and Geller (1984) discuss inverting surface wave data with one nodal plane 
constrained, and Romanowicz and Guillemant (1984) discuss inverting 
surface waves for depth determination. 

Opposing (double-couple versus slump) source models for the 1929 
Grand Banks earthquake are explored by Hasegawa and Kanamori (1987) 
and Bent (1995); Tappin et al. (1999) discuss a slump origin for the 1998 
New Guinea tsunami. Julian and Sipkin (1985) and Wallace (1985)
consider CLVD versus double-couple models for earthquakes in the Long
Valley caldera. Heaton and Hartzell (1988) discuss source study using 
near-field earthquake ground motions. 

General treatments of geodesy include those by Lambeck (1988) and 
Torge (1991). Geodetic solutions for faults are given by Okada (1985). 


Mavko (1981) reviews fault models and the use of geodetic data to study
faulting, and Burgmann et al. (2000) review the use of radar
interferometry. References for topics related to the tectonic setting of earthquakes,
use of the Global Positioning System, and the relation of earthquakes to 
fault mechanics are given at the end of Chapter 5. 

A detailed discussion of earthquake magnitudes is presented by Geller 
and Kanamori (1977). Relations between fault parameters are given in 
Kanamori and Anderson (1975); source spectra and scaling laws are
discussed by Geller (1976). Our treatment of moment magnitude and
earthquake energy follows Kanamori (1977a) and Hanks and Kanamori (1979).
Atkinson and Beresnev (1997) discuss the relation between stress drop as 
a source parameter and as a tectonic quantity. Okal and Romanowicz 
(1994) give an overview of frequency-magnitude relations. Turcotte 
(1992) and Main (1996) review self-similar models for earthquakes. 
References for topics related to earthquake forecasting and seismic gaps 
are given at the end of Chapter 1. In particular, Kagan and Jackson (1991) 
discuss the challenge of testing forecasts. 

A voluminous literature deals with studies of individual earthquakes, 
especially those that are of special interest because of their size,
damage, tectonic setting, or location near centers of seismological research.
Some recent examples include issues of the Bulletin of the Seismological
Society of America dealing with the 1989 Loma Prieta (October 1991
issue), 1992 Landers (June 1994 issue), and 1994 Northridge
(February 1996 issue) earthquakes. Detailed studies of other earthquakes can
often be found using the American Geological Institute’s Georef WWW
search tool, available through many earth science departments and
libraries. The locations and focal mechanisms of post-1977 earthquakes
around the world are available at
http://www.seismology.harvard.edu/CMTsearch.html, and information about
earthquakes in specific areas, including seismograms, can often be found
at WWW sites compiled at
http://www.geophys.washington.edu/seismosurfing.html or
http://www.iris.edu.


Problems




1. Using the travel time chart in Fig. 3.5-4 for earthquakes at a depth 
of 600 km, graph the take-off angle of the P wave for stations at 
distances from 2000 to 10,000 km. Assume that the P velocity at 
600 km depth is 10 km/s. Use enough points for a smooth graph. 

2. Plot the following focal mechanisms on a stereonet by using the 
relations in Section 4.2.5 to find the second nodal plane. Indicate 
the compressional and dilatational quadrants, mark the P and T 
axes, and describe the type of faulting. Use the conventions of 
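Fig. 4.2-2, and remember that dip is defined from the $-x_2$ axis
and is less than 90°.

(a) $\phi = 330^\circ$, $\delta = 65^\circ$, $\lambda = 70^\circ$

(b) $\phi = 280^\circ$, $\delta = 60^\circ$, $\lambda = 270^\circ$

(c) $\phi = 280^\circ$, $\delta = 60^\circ$, $\lambda = 90^\circ$

(d) $\phi = 40^\circ$, $\delta = 80^\circ$, $\lambda = 20^\circ$

(e) $\phi = 40^\circ$, $\delta = 80^\circ$, $\lambda = 200^\circ$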

3. Figure P4.1 gives a stereonet and first motion data for four earth¬ 
quakes on stereonets of the same scale. Closed circles show com¬ 
pressions, and open circles show dilatations. To evaluate the focal 
mechanism for each earthquake: 

(i) Find nodal planes that you consider the best solution. Show 
these planes on the first motion plots, and measure their 
strikes and dips. 

(ii) Find two planes bounding the acceptable range for each 
nodal plane. 



(iii) For each best choice nodal plane, give the motion (right- 
lateral strike-slip, left-lateral strike-slip, dip-slip - thrust or 
normal) implied by the focal mechanism for slip on that 
nodal plane. If the faulting is a combination of the above, 
give the dominant type. 

(iv) Find the B, P, and T axes and the two possible slip angles 
(one for each nodal plane) implied by the best choice nodal 
planes. Check that these are consistent with the answers to 
part iii. 

4. If a P wave leaves the focal sphere exactly on a nodal plane, it 
should theoretically have zero amplitude. Explain why this is not 
the case in reality. 

5. Derive the travel time for sP (Eqn 4.3.11) using a geometry similar
to that of Fig. 4.3-6 (bottom).

6. For a fault plane solution in which one plane has strike $\phi_1$ and
dip $\delta_1$, and the second plane is striking at $\phi_2$, show that
$\tan \lambda_1 = \cot(\phi_2 - \phi_1)/\cos \delta_1$. For what angles will this not apply?

7. Compute the Love wave amplitude radiation pattern for an 
isotropic source. 

8. Use the expression for the moment tensor of a double couple 
(Eqn 4.4.5) to prove that it obeys the tensor transformation law 
(Eqn 2.3.18). 

9. (a) Show how the moment tensor for a vertical dipole can be
decomposed into an isotropic source and a CLVD.

Fig. P4.2 See problem 13.



(b) Using the decomposition in Eqn 4.4.48, decompose the 
diagonalized moment tensor in Eqn 4.4.47 into a double 
couple and a CLVD. Find the ratios of the double-couple scalar 
moment and CLVD scalar moment to the scalar moment of the 
original tensor. 

(c) Give an alternative decomposition to Eqn 4.4.48 that makes 
the double couple smaller and the CLVD larger. Use this 
decomposition on the diagonalized moment tensor in Eqn 
4.4.47, and find the ratios of the double-couple scalar moment 
and the CLVD scalar moment to the scalar moment of the 
original tensor. 

10. Show for an infinite buried strike-slip fault extending from depth
w to depth W that the maximum coseismic surface displacement
occurs at distance $y = (wW)^{1/2}$ from the fault.

11. Assume that a geodetic position is measured with an uncertainty of 
3 mm. How precise will estimates of its velocity be after 1, 5, and 
10 years of measurements? 

12. (a) Using the analytic expression for an interseismic velocity
profile across a strike-slip fault, define a criterion to estimate the
fault locking depth.

(b) Use this criterion to estimate the locking depth for the 
GPS velocity profile across the San Andreas fault shown in 
Fig. 4.5-13. 

(c) For this profile, estimate the far-field slip rate. 

(d) Use the analytic expression to find the rate that would be 
estimated by measuring the velocity at this location, but on a 
baseline extending only 5 km on either side of the fault. 

13. Use the seismogram in Fig. P4.2 to determine the surface wave 
magnitude of the earthquake. The scale bar indicates 1 cm on the 
seismogram. Assume that the seismometer’s magnification is 3000, 
and that the earthquake is 17° away. 

14. Use the fault parameters given for the earthquakes in Table 4.6-1 
and the theoretical relations in Eqns 4.6.18-20 to estimate the 
stress drop for each. Use all three geometries, and note which seems 
most geologically appropriate. (Part of this is done for the 1964 
Alaska earthquake in the text.) How does the inferred stress drop 
depend on the assumed geometry? 

15. Assume that the largest earthquakes on the San Andreas fault have 
the same fault width (10 km) and average slip (4 m) as estimated 
for the 1906 earthquake. How long would the fault have to be for 
these earthquakes to have the same seismic moment as the 1960 
Chilean or 1964 Alaska earthquakes (Table 4.6-1)? Compare this 
value to the length of the San Andreas fault (Fig. 5.2-3). 

16. Plot log S versus log $M_0$, as in Fig. 4.6-11, for the six earthquakes in
Table 4.6-1. If you fit a line through these six points and assume a 
constant stress drop, does the slope agree with Eqn 4.6.17? 

17. For the observed earthquake source spectrum in Fig. 4.6-8,
estimate the corner frequency. Making the necessary assumptions,
estimate a source dimension and stress drop. Given the different
assumptions and models possible, your values are likely to differ
from the 30 km and 65 bars inferred by the study shown.

18. $M_s$ magnitudes are usually measured at a period of 20 s. If they
were measured at 30 s instead, would $M_s$ values saturate at a higher
or a lower value than usual $M_s$ values, and why?

19. (a) Derive Eqn 4.6.29 for the seismic efficiency. 

(b) Assuming that the average stress in the earth during faulting 
is 1.5 kbar, estimate the seismic efficiency for a typical
earthquake. What does this say about the fraction of the strain
energy that goes into seismic waves? 

20. The largest earthquakes release more total energy than smaller 
events, because if all the magnitude 6s released more energy than 
the magnitude 7s, the magnitude 5s released more energy than 
the magnitude 6s, and so on, then the seismic energy released by the 
smallest-magnitude events would approach infinity. What is the
largest possible global value of b without this impossible scenario
occurring, if b were constant down to very small magnitudes
(which it is not)?

21. From the values given in Section 4.7.1, estimate the mean
recurrence time for earthquakes with magnitudes greater than 6, 7, and
8 in Japan, southern California, and the New Madrid seismic zone. 

22. Using only the instrumental data in Fig. 4.7-6, estimate the
recurrence interval for an earthquake with magnitude 7.5 or greater in
the Wasatch fault zone (Utah). Compare this estimate to that 
shown for the paleoseismic data. 

Computer problems 

C-1. (a) Write a subroutine to compute the elements of a fault’s
normal vector and slip vector given the three fault angles.

(b) Use this routine to compute n and d for the focal
mechanisms in problem 2. Compare your results to those obtained
from the stereonet. 

(c) Test numerically that n and d for all these mechanisms are 
orthogonal. A subroutine from the computer problems in the 
Appendix, C-4, can be used. 

C-2. (a) Write a subroutine to compute the elements of vectors in the 
directions of the P and T axes using the results of C-1.

(b) Use this routine to find the directions of the P and T axes for 
the focal mechanisms in problem 2. Compare your results to 
those obtained from the stereonet. 

C-3. (a) Write a subroutine to compute the elements of the moment 
tensor using the results of C-1.

(b) Use this routine to find the moment tensors for the focal 
mechanisms in problem 2. 

C-4. (a) Write a subroutine to convert the elements of the moment
tensor to P and T axes by diagonalizing the tensor. The
eigenvalue-eigenvector routine from the Appendix,
problem C-12, may be useful.

Fig. P4.3 See problem C-6.


(b) Use this routine to find the directions of the P and T axes for 
the focal mechanisms in problem 2. Compare your results to 
those obtained in C-2. 

C-5. Write subroutines to generate the amplitude radiation patterns 
for Love and Rayleigh waves. Use these, with values of the 
excitation functions 

$P_L = -2.75$, $Q_L = -0.34$, and $S_R = 4.0$, $P_R = 2.7$, $Q_R = -1.6$ 

to replicate the examples of Fig. 4.3-12. 

C-6. Figure P4.3 shows three ways to evaluate integrals numerically. 
To see how these work: 

(a) Analytically integrate the function y = x 1 over the interval 
0 < x < 10. 

(b) Write a subroutine to numerically integrate this function 
using inscribed rectangles as in Fig. P4.3a. Try this with 
intervals of 2 (as shown) and 0.02. What is the percentage 
difference between these results and the true value in part (a)? 

(c) Repeat (b) using intermediate rectangles, as shown in 
Fig. P4.3b. 

(d) Repeat (b) using trapezoids, as in Fig. P4.3c. 

C-7. (a) Write a subroutine that uses one of the methods in C-6 to 
integrate the Gaussian probability function $p(t, \tau, \sigma)$ (Eqn 
4.7.13) over an interval from -t to t. 

(b) Use the subroutine to find the integral of $p(t, \tau, \sigma)$ (Eqn 
4.7.13) over the interval $-10 < t < 10$ with $\tau = 0$ and $\sigma = 5$, and 
explain the result. 

C-8. (a) Write a program to estimate the conditional probability, 
using Gaussian and Poisson models, that an earthquake will 
occur in a specified time interval, given the time of the last 
earthquake and the mean and standard deviations of the 
recurrence time. The routine in C-7 will be useful for the 
Gaussian model. 

(b) Check the routine using the San Andreas example in Fig. 
4.7-9 for 20-year periods beginning in 1983, 2057, and 
2157. 


(c) Calculate the values for the same periods, but using a mean 
recurrence of 132 years and a standard deviation of 105 
years, which correspond to the full Pallett Creek earthquake 
series. Explain how and why the results change. 

C-9. Use the routine from C-8 to estimate the Poisson and Gaussian 
conditional probabilities of a major earthquake in the New 
Madrid seismic zone in the next 20 years, assuming that the past 
one occurred in 1812. Assume that major earthquakes have: 

(a) a mean recurrence time of 500 years with standard 
deviation 100 years. 

(b) a mean recurrence time of 750 years with standard 
deviation 250 years. 

(c) a mean recurrence time of 1000 years with standard 
deviation 500 years. 

C-10. Write a subroutine (or set up a spreadsheet) to compute the 
mean and standard deviation of a series of numbers. 

C-11. By combining the results from C-8 and C-10: 

(a) Find the mean and standard deviation of recurrence inter¬ 
vals for the series of Parkfield earthquakes that occurred 
in the years 1857, 1881, 1901, 1922, 1934, and 1966. 
Compute Poisson and Gaussian conditional probabilities 
starting in 1985 for an earthquake in the eight-year inter¬ 
val until 1993. 

(b) Do the same calculation if the 1934 earthquake had 
occurred in 1944, as implicitly assumed when the prediction 
discussed in Section 1.2.5 was made. How do the 
values change and why? 

(c) The awaited earthquake may or may not have occurred 
by the time you do this problem. In either event, assume 
that it has not occurred by 2010, and find the mean and 
standard deviation of the recurrence times from the dates 
in (a), also including the interval 1966-2010. Calculate 
the Poisson and Gaussian conditional probabilities that 
the earthquake will occur in eight years from 2010. 

(d) Do the same assuming the earthquake has not occurred 
by 2020. 

(e) Compare the results of (a), (c), and (d) and explain the 
differences. 








Seismology and Plate Tectonics 


The acceptance of continental drift has transformed the earth sciences from a group of rather unimaginative studies based on pedes¬ 
trian interpretations of natural phenomena into a unified science that holds the promise of great intellectual and practical advances. 

J. Tuzo Wilson, Continents Adrift and Continents Aground, 1976 


5.1 Introduction 

Two of the major advances in the earth sciences since the 1960s 
have been the growth of global seismology and the develop¬ 
ment of our understanding of global plate tectonics. The two 
are closely intertwined because seismological advances pro¬ 
vided some of the crucial data that make plate tectonics the 
conceptual framework used to think about large-scale pro¬ 
cesses in the solid earth. 

The theory of plate tectonics grew out of the earlier theory 
of continental drift, proposed in its modern form by Alfred 
Wegener in 1915. The idea that continents drifted apart was an 
old one, rooted in the remarkable fit of the coasts of South 
America and Africa. Still, without compelling evidence for 
motion between continents, the idea that such motions were 
physically impossible prevented most geologists from accept¬ 
ing Wegener’s ideas. By the 1970s the story was very different. 
Geologists accepted continental drift in large part because 
paleomagnetic measurements, based on the geometry and his¬ 
tory of the earth’s magnetic field, showed that continents had in 
fact moved over millions of years. Combination of these obser¬ 
vations with results from seismology and marine geology and 
geophysics led to the realization that all parts of the earth’s 
outer shell, not just the continents, were moving. 

Plate tectonics is conceptually simple: it treats the earth’s 
outer shell as made up of about 15 rigid plates, about 100 km 
thick, which move relative to each other at speeds of a few cm 
per year. 1 The plates are rigid in the sense that little (ideally 
no) deformation occurs within them, so deformation occurs 
at their boundaries, giving rise to earthquakes, mountain 
building, volcanism, and other spectacular phenomena. These 
strong plates form the earth’s lithosphere , and move over the 

1 This is about the speed at which fingernails grow. 


weaker asthenosphere below. The lithosphere and asthenosphere 
are mechanical units defined by their strength and the 
way they deform. The lithosphere includes both the crust and 
part of the upper mantle. 

Figure 5.1-1 shows the three basic types of plate bound¬ 
aries. Warm mantle material upwells at spreading centers , 
also known as mid-ocean ridges, and then cools. Because the 
strength of rock decreases with temperature (Section 5.7.3), 
the cooling material forms strong plates of new oceanic litho¬ 
sphere. The cooling oceanic lithosphere moves away from the 
ridges, and eventually reaches subduction zones , or trenches, 2 
where it descends in downgoing slabs back into the mantle, re¬ 
heating as it goes. The direction of the relative motion between 
two plates at a point on their common boundary determines 
the nature of the boundary. At spreading centers both plates 
move away from the boundary, whereas at subduction zones 
the subducting plate moves toward the boundary. At the third 
boundary type, transform faults , relative plate motion is paral¬ 
lel to the boundary. 

As discussed in Section 3.8, seismology shows that the 
structure of the mantle and the core varies with depth, due to 
changes in temperature, pressure, mineralogy, and composi¬ 
tion. Plate tectonics describes the behavior of the lithosphere, 
the strong outer shell of the mantle, which is the cold outer 
boundary layer of the thermal convection system involving the 
mantle and the core that removes heat from the earth’s interior. 
Although much remains to be learned about this convective 
system, especially in the lower mantle and the core (Fig. 5.1-2), 
there is general agreement that at shallow depths the warm, 

2 Boundaries are described either as mid-ocean ridges and trenches, emphasizing 
their morphology, or as spreading centers and subduction zones, emphasizing 
the plate motion there. The latter nomenclature is more precise, because there are 
elevated features in the ocean basins that are not spreading ridges, and spreading 
centers like the East African rift exist within continents. 
Fig. 5.1-1 Plate tectonics at its simplest. 
Oceanic lithosphere is formed at ridges and 
subducted at trenches. At transform faults, 
plate motion is parallel to the boundaries. 
Each boundary type has typical 
earthquakes. 


Fig. 5.1-2 Schematic diagram showing 
ideas about mantle convection. Ridges 
reflect upper mantle upwelling. Slabs 
penetrate into the lower mantle, causing 
heterogeneity there, and in some cases 
descend to the base of the mantle. Mantle 
(hot spot) plumes reflect lower mantle 
upwelling. Many features shown are 
controversial and subject to change without 
notice. (Modified from Stacey, 1992.) 



and hence less dense, material rising below spreading centers 
forms upwelling limbs, whereas the relatively cold, and hence 
dense, subducting slabs form downwelling limbs. Although 
the lithosphere is a very thin layer compared to the rest of the 
mantle (100 km is 1/29 of the mantle’s radius), it is where 
the greatest temperature change occurs, from about 1300° to 
1400°C at a depth of 100 km to about 0°C at the surface. For 
this reason, the lithosphere is called a thermal boundary layer. 
Because of this temperature change, the lithosphere is much 
stronger than the underlying rock, and so is also a mechanical 
boundary layer. This strong boundary layer is thought to be a 
primary reason why plate tectonics is much more complicated 
than expected from simple convection models. Moreover, 
the lithosphere, which contains the crust, is also a chemical 
boundary layer distinct from the remainder of the mantle. Con¬ 
tinental lithosphere is especially distinct: although individual 
plates can contain both oceanic and continental lithosphere, 
the latter is made of less dense rock than the former (recall the 


differences between granitic and basaltic rocks discussed in 
Section 3.2), and so does not subduct. The oceanic lithosphere 
is continuously subducted and reformed at ridges, and so never 
gets older than about 200 Myr. The continental lithosphere, 
however, can be billions of years old. 

Put another way, plate tectonics is the primary surface mani¬ 
festation of the heat engine whose nature and history govern 
the planet’s thermal, mechanical, and chemical evolution. 3 
Earth’s heat engine is characterized by the balance between 
three modes of heat transfer from the interior: the plate tectonic 
cycle involving the cooling of oceanic lithosphere; mantle 
plumes, which are thought to be a secondary feature of mantle 
convection; and heat conduction through continents that are 
not subducted and hence do not participate directly in the 
oceanic plate tectonic cycle. Based on estimates from sea floor 
topography and heat flow, discussed shortly, terrestrial heat 

3 It has been said that heat is the geological lifeblood of planets. 
loss seems to occur primarily (about 70%) via plate tectonics, 
with about 5% via hot spots (mantle plumes). By contrast, 
Earth’s grossly similar sister planets, Mars and Venus, seem to 
function quite differently, because large-scale plate tectonics 
appears absent, at least at present. 

Plate tectonics is also crucial for the evolution of Earth’s 
ocean and atmosphere, because it involves many of the primary 
means (including volcanism, hydrothermal circulation through 
cooling oceanic lithosphere, and the cycle of uplift and erosion) 
by which the solid earth interacts with the ocean and the atmo¬ 
sphere (Fig. 5.1-3). The chemistry of the oceans and the atmo¬ 
sphere depends in large part on plate tectonic processes, and 
many long-term features of climate are influenced by moun¬ 
tains that are uplifted by plate convergence and the positions of 
continents that control ocean circulation. In fact, the presence 
of plate tectonics may explain how life evolved on earth (at 
mid-ocean ridge hot springs) and be crucial for its survival (the 
atmosphere is maintained by plate boundary volcanism, and 
plate tectonics raises the continents above sea level). 

As a result, plate tectonics is heavily studied by earth scient¬ 
ists. Our goal in this chapter is to introduce some of the ways 
in which seismology contributes to these studies. Some sources 
for more general and more detailed treatments of these topics 
are listed at the end of the chapter. 

Seismology plays several key roles in our studies of plate 
tectonics. The distribution of earthquakes provides strong 
evidence for the idea of essentially rigid plates, with deforma¬ 
tion concentrated on their boundaries. Figure 5.1-4 shows 
maps of global seismicity covering the time period 1964-97. 
Such maps did not become available until the early 1960s, 
when the World Wide Standardized Seismographic Network 
(WWSSN) allowed accurate locations for earthquakes of 
magnitude 5 or greater anywhere in the world. The map shows 
several remarkable patterns. 

The mid-ocean ridge system, where the oceanic lithosphere 
is created, is beautifully outlined by the earthquake locations. 
For example, the Mid-Atlantic ridge and East Pacific rise can be 
followed using epicenters for thousands of kilometers. The loca¬ 
tions of the trenches, where oceanic lithosphere is subducted, 


are even more apparent in the lower panel showing earth¬ 
quakes with focal depths greater than 100 km, because mid¬ 
ocean ridge earthquakes are shallow and thus do not appear. 

It is especially impressive to plot the locations of earthquakes 
on cross-sections across trenches (Fig. 5.1-5). Inclined zones of 
seismicity delineate the subducting oceanic plates, which travel 
time and attenuation studies show to be colder and stronger 
than the surrounding mantle. These zones, identified before 
their plate tectonic significance became clear, are known as 
Wadati-Benioff zones after their discoverers. 4 

The interplate earthquakes both delineate plate boundaries 
and show the motion occurring there. We will see that the 
direction of faulting reflects the spreading at mid-ocean ridges 
and subduction at trenches. The earthquake locations and 
mechanisms also show that plate boundaries in continents are 
often complicated and diffuse, rather than the simple narrow 
boundaries assumed in the rigid plate model that are a good 
approximation to what we see in the oceans. For example, 
seismicity shows that the collision of the Indian and Eurasian 
plates creates a deformation zone which includes the Himalayas 
but extends far into China. Similarly, the northward 
motion of the Pacific plate with respect to North America 
creates a broad seismic zone, indicating that the plate boundary 
zone spans much of the western USA and Canada. 

In addition, intraplate earthquakes occur within plate 
interiors, far from boundary zones. For example, Fig. 5.1-4 
shows earthquakes in eastern Canada and central Australia. 
Such earthquakes are much rarer than plate boundary zone 
earthquakes, but are common enough to indicate that plate 
interiors are not perfectly rigid. In some cases these earth¬ 
quakes are associated with intraplate volcanism, as in Hawaii. 
Intraplate earthquakes are studied to provide data about where 
and how the plate tectonic model does not fully describe tec¬ 
tonic processes. 

4 Kiyoo Wadati (1902-95) discovered the existence of deep seismicity and its 
geometry under Japan; Hugo Benioff (1899-1968), also known for important 
contributions to seismological instrumentation, discussed the global nature of deep 
earthquakes and their relation to surface features (Fig. 1.1-10). 
Fig. 5.1-5 Seismicity cross-section perpendicular to the New Hebrides 
trench showing the Wadati-Benioff zone. This dipping plane of 
earthquakes indicates the position of the subducting plate. (Isacks and 
Barazangi, 1977. Island Arcs, Deep Sea Trenches and Back Arc Basins , 
99-114, copyright by the American Geophysical Union.) 

In summary, seismology provides crucial information 
about both plate kinematics , the directions and rates of plate 
motions, and plate dynamics , the forces causing plate motions. 
As we will see, seismicity is one of the major tools used to 
identify and delineate plate boundary zones, and earthquake 
mechanisms are among the primary data used to determine the 
motion within plate boundary zones. The mechanisms also 
provide information about the stresses acting at plate boundar¬ 
ies and within plates, which, together with earthquake depths 
and seismic velocity structure, are important in developing 
ideas about the forces involved and the physical processes by 
which rocks deform and cause earthquakes. Conversely, plate 
motion data are used to draw inferences about the locations 
and times of future earthquakes and their societal risks. Thus it 
is often hard, and sometimes pointless, to decide where seismo¬ 
logy ends and plate tectonics begins, or vice versa. 

5.2 Plate kinematics 

Understanding the distribution and types of earthquakes 
requires an understanding of the geometry of plate motions, or 
plate kinematics. In this section we sketch some basic results, 
of which we assume most readers have some knowledge. As 
full exploration of this topic is beyond our scope, readers are 
encouraged to delve into the suggested literature. 

5.2.1 Relative plate motions 

A basic principle of plate tectonics is that the relative motion 
between any two plates can be described as a rotation about an 
Euler pole 1 (Fig. 5.2-1). This condition controls the types of 
boundaries and the focal mechanisms of earthquakes resulting 
from relative motions, as discussed later. Specifically, at any 

1 This term comes from Euler’s theorem, which states that the displacement of any 
rigid body (in this case, a plate) with one point (in this case, the center of the earth) 
fixed is a rotation about an axis. 



Fig. 5.2-1 Geometry of plate motions. Linear velocity at point $\mathbf{r}$ is given 
by $\mathbf{v}_{ji} = \boldsymbol{\omega}_{ji} \times \mathbf{r}$. The Euler pole is the intersection of the Euler vector with 
the earth’s surface. Note that west longitudes and south latitudes are 
negative. 

point $\mathbf{r}$ along the boundary between plate $i$ and plate $j$, with 
latitude $\lambda$ and longitude $\mu$, the linear velocity of plate $j$ with 
respect to plate $i$ is 

$$\mathbf{v}_{ji} = \boldsymbol{\omega}_{ji} \times \mathbf{r}. \tag{1}$$

This is the usual formulation for rigid body rotations in 
mechanics: $\mathbf{r}$ is the position vector to the point on the boundary, 
and $\boldsymbol{\omega}_{ji}$ is the angular velocity vector, or Euler vector. Both 
vectors are defined from an origin at the center of the earth. 

The direction of relative motion at any point on the bound¬ 
ary is a small circle, a parallel of latitude about the Euler pole 
(not a geographic parallel about the North Pole!). For example, 
in Fig. 5.2-2 (top) the pole shown is for the motion of plate 2 
with respect to plate 1. The convention used is that the first 
named plate ($j = 2$) moves counterclockwise (in a right-handed 
sense) about the pole with respect to the second named plate 
($i = 1$). The segments of the boundary where relative motion is 
parallel to the boundary are transform faults. Thus transforms 
are small circles about the pole, and earthquakes occurring on 
them should have pure strike-slip mechanisms. Other segments 
have relative motion away from the boundary, and are thus 
spreading centers. Figure 5.2-2 (bottom) shows an alternative 
case. The pole here is for plate 1 ($j = 1$) with respect to plate 2 
($i = 2$), so plate 1 moves toward some segments of the boundary, 
which are subduction zones. 

The magnitude, or rate, of relative motion increases with 
distance from the pole because 

$$|\mathbf{v}_{ji}| = |\boldsymbol{\omega}_{ji}|\,|\mathbf{r}| \sin \gamma, \tag{2}$$

where $\gamma$ is the angle between the Euler pole and the site (corresponding 
to a colatitude about the pole). All points on a plate 
Fig. 5.2-2 Relationship of motions on plate boundaries to the Euler 
pole. Relative motions occur along small circles about the Euler pole 
(short dashed lines) at a rate that increases with distance from the pole. 
Note the difference the sense of rotation makes: $\boldsymbol{\omega}_{ji}$ is the Euler vector 
corresponding to the rotation of plate $j$ counterclockwise with respect to $i$. 

boundary have the same angular velocity, but the magnitude of 
the linear velocity varies from zero at the pole to a maximum 
90° away. 
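As a rough numerical illustration of Eqn 2, the sketch below evaluates the rate at a few angular distances from the pole. The function name `rate_mm_per_yr`, the mean radius of 6371 km, and the sample rotation rate of 1°/Myr are assumptions chosen for illustration and do not come from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def rate_mm_per_yr(omega_deg_per_myr, gamma_deg, radius_km=EARTH_RADIUS_KM):
    """Magnitude of the linear velocity, |v| = |omega| |r| sin(gamma) (Eqn 2).

    A rotation rate in rad/Myr times a radius in km gives km/Myr,
    which is numerically equal to mm/yr.
    """
    omega_rad_per_myr = math.radians(omega_deg_per_myr)
    return omega_rad_per_myr * radius_km * math.sin(math.radians(gamma_deg))


if __name__ == "__main__":
    # The rate is zero at the pole and largest 90 degrees away from it.
    for gamma in (0.0, 30.0, 60.0, 90.0):
        print(gamma, round(rate_mm_per_yr(1.0, gamma), 1))
```

The rate at 30° from the pole is half the maximum, because sin 30° = 0.5.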

The components of the vectors can be written in Cartesian 
(x, y, z) coordinates (Fig. 5.2-1). The position vector is 

$$\mathbf{r} = (a \cos\lambda \cos\mu,\; a \cos\lambda \sin\mu,\; a \sin\lambda), \tag{3}$$

where $a$ is the earth’s radius. Similarly, if the Euler pole is at 
latitude $\theta$ and longitude $\phi$, the Euler vector is written (neglecting 
the $ij$ subscripts for simplicity) as 

$$\boldsymbol{\omega} = (|\boldsymbol{\omega}| \cos\theta \cos\phi,\; |\boldsymbol{\omega}| \cos\theta \sin\phi,\; |\boldsymbol{\omega}| \sin\theta), \tag{4}$$

where the magnitude, $|\boldsymbol{\omega}|$, is the scalar angular velocity or 
rotation rate. To find the Cartesian components of the linear 
velocity v, we evaluate the cross product (Eqn 1) using its 
definition (Eqn A.3.28), and find 

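$$\begin{aligned}
\mathbf{v} &= (v_x, v_y, v_z), \\
v_x &= a|\boldsymbol{\omega}| (\cos\theta \sin\phi \sin\lambda - \sin\theta \cos\lambda \sin\mu), \\
v_y &= a|\boldsymbol{\omega}| (\sin\theta \cos\lambda \cos\mu - \cos\theta \cos\phi \sin\lambda), \\
v_z &= a|\boldsymbol{\omega}| \cos\theta \cos\lambda \sin(\mu - \phi).
\end{aligned} \tag{5}$$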
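The Cartesian evaluation of Eqn 1 can be sketched in a few lines. This is a minimal illustration, not the authors' routine: the helper names (`to_cartesian`, `cross`, `dot`, `linear_velocity`) and the radius of 6371 km are assumptions, and the rotation rate is taken in degrees per Myr so that the result is in km/Myr (equivalently mm/yr).

```python
import math


def to_cartesian(lat_deg, lon_deg, length):
    """Cartesian components of a vector of given length (form of Eqns 3 and 4)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (length * math.cos(lat) * math.cos(lon),
            length * math.cos(lat) * math.sin(lon),
            length * math.sin(lat))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def linear_velocity(pole_lat, pole_lon, omega_deg_per_myr,
                    site_lat, site_lon, radius_km=6371.0):
    """Linear velocity v = omega x r in km/Myr (numerically mm/yr)."""
    omega = to_cartesian(pole_lat, pole_lon, math.radians(omega_deg_per_myr))
    r = to_cartesian(site_lat, site_lon, radius_km)
    return cross(omega, r)
```

Because the velocity is a cross product, it is perpendicular to both the Euler vector and the position vector, which gives a quick check on any implementation.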

At the point r, the north-south and east-west unit vectors 
can be written in terms of their Cartesian components using 
Eqn A.7.4, 

$$\begin{aligned}
\mathbf{e}^{NS} &= (-\sin\lambda \cos\mu,\; -\sin\lambda \sin\mu,\; \cos\lambda), \\
\mathbf{e}^{EW} &= (-\sin\mu,\; \cos\mu,\; 0),
\end{aligned} \tag{6}$$

so we find the north-south and east-west components of v by 
taking dot products of its Cartesian components (Eqns 5) with 
the unit vectors (Eqns 6), and obtain 

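$$\begin{aligned}
v^{NS} &= a|\boldsymbol{\omega}| \cos\theta \sin(\mu - \phi), \\
v^{EW} &= a|\boldsymbol{\omega}| [\sin\theta \cos\lambda - \cos\theta \sin\lambda \cos(\mu - \phi)].
\end{aligned} \tag{7}$$

We can then find the rate and direction of plate motion, 

$$\begin{aligned}
\text{rate} &= |\mathbf{v}| = \sqrt{(v^{NS})^2 + (v^{EW})^2}, \\
\text{azimuth} &= 90^\circ - \tan^{-1}[(v^{NS})/(v^{EW})],
\end{aligned} \tag{8}$$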

such that azimuth is measured in the usual convention, degrees 
clockwise from North. 
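Equations 7 and 8 translate directly into code. The sketch below is one possible arrangement under stated assumptions: the function names and the 6371 km radius are illustrative, and the azimuth is computed with `atan2(v_ew, v_ns)`, a substitute for the arctangent form of Eqn 8 that gives the same angle when the eastward component is positive and also resolves the quadrant when it is negative.

```python
import math


def ns_ew_velocity(pole_lat, pole_lon, omega_deg_per_myr,
                   site_lat, site_lon, radius_km=6371.0):
    """North-south and east-west velocity components (Eqn 7), in mm/yr."""
    theta, phi = math.radians(pole_lat), math.radians(pole_lon)
    lam, mu = math.radians(site_lat), math.radians(site_lon)
    scale = radius_km * math.radians(omega_deg_per_myr)  # km/Myr = mm/yr
    v_ns = scale * math.cos(theta) * math.sin(mu - phi)
    v_ew = scale * (math.sin(theta) * math.cos(lam)
                    - math.cos(theta) * math.sin(lam) * math.cos(mu - phi))
    return v_ns, v_ew


def rate_and_azimuth(v_ns, v_ew):
    """Rate and azimuth in degrees clockwise from north (Eqn 8)."""
    rate = math.hypot(v_ns, v_ew)
    azimuth = math.degrees(math.atan2(v_ew, v_ns)) % 360.0
    return rate, azimuth
```

For a pole at the geographic North Pole and a site on the equator, the motion is due east, so the azimuth should come out as 90°.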

In evaluating these expressions, it is important to be careful 
with dimensions. Although rotation rates are typically reported 
in degrees per million years, they should be converted to 
radians per year. The resulting linear velocity will have the 
same dimensions as Earth’s radius. By serendipity, the conversions 
of radius from km to mm and of Myr to years cancel, so only the 
degrees to radians ($\times\, \pi/180°$) conversion actually needs to be 
done to obtain a linear velocity in mm/yr. Plate motions are 
often quoted as mm/yr, because a year is a comfortable unit 
of time for humans and 1 mm/yr corresponds to 1 km/Myr, 
making it easy to visualize what seemingly slow plate motion 
accomplishes over geologic time. 
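The cancellation can be seen by writing out the arithmetic for a site 90° from the pole. The values used here, a radius of 6371 km and a rotation rate of 1°/Myr, are assumed round numbers for illustration only.

```latex
\begin{aligned}
a\,|\boldsymbol{\omega}|
  &= 6371~\text{km} \times 1^\circ/\text{Myr} \times \frac{\pi}{180^\circ} \\
  &\approx 111.2~\text{km/Myr} \\
  &= \frac{111.2 \times 10^{6}~\text{mm}}{10^{6}~\text{yr}}
   = 111.2~\text{mm/yr}.
\end{aligned}
```

The two factors of $10^{6}$ cancel, so only the factor $\pi/180°$ changes the number.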

To see how this works, consider Fig. 5.2-3, which shows the 
North America-Pacific boundary zone. The map is drawn in a 
projection about the Euler pole, so the expected relative motion 
is parallel to small circles like the one shown. By analogy to 
Fig. 5.2-2, this geometry predicts NW-SE-oriented spreading 
along ridge segments in the Gulf of California, which are rifting 
Baja California away from the rest of Mexico. Further north, 
the San Andreas fault system is essentially parallel to the 
relative motion, so is largely a transform fault. In Alaska, the 
eastern Aleutian arc is perpendicular to the plate motion, so 
the Pacific plate subducts beneath North America. Thus this 
plate boundary contains ridge, transform, and trench portions, 
depending on the geometry of the boundary. 2 In addition, the 

2 A good way to visualize the plate motion is to photocopy Fig. 5.2-3, cut along 
the boundary of the Pacific plate, and then photocopy the “Pacific” onto another 
piece of paper. Putting the “Pacific” beneath “North America” and rotating around a 
thumbtack through the pole shows the ridge, transform, and trench motions both 
forward and backward in time. 
Fig. 5.2-3 Geometry and focal mechanisms 
for a portion of the North America-Pacific 
boundary zone that also includes the small 
Juan de Fuca (JF) plate. The map projection 
is about the Pacific-North America Euler 
pole, so the line with dots shows a small 
circle and thus the direction of plate 
motion. This small circle is further from the 
pole than the San Andreas fault, so the rate 
of motion on it is larger. The variation in 
the boundary type along its length from 
extension, to transform, to convergence, is 
shown by the focal mechanisms. The diffuse 
nature of the boundary zone is shown by 
seismicity (small dots), focal mechanisms, 
topography (elevation above 1000 m is 
shaded), and vectors showing the motion of 
GPS and VLBI sites (squares) (Bennett et al., 
1999) with respect to the stable interior of 
North America. The velocity scale is shown 
by the plate motion arrows; some site 
motion vectors are too small to be seen. 
(Stein and Klosko, 2002. From The 
Encyclopedia of Physical Science and 
Technology , ed. R. A. Meyers, copyright 
2002 by Academic Press, reproduced by 
permission of the publisher.) 


boundary zone contains the small Juan de Fuca plate, which 
subducts beneath the Pacific Northwest at the Cascadia 
subduction zone. 

Equation 8 lets us find how the motion varies. The predicted 
motion of the Pacific plate with respect to the North American 
plate at a point on the San Andreas fault (36°N, 239°E) has 
a rate of 46 mm/yr at an azimuth of N36°W. The predicted 
direction agrees reasonably well with the average trend of 
the San Andreas fault, N41°W. Thus, to first order, the San 
Andreas is a Pacific-North America transform plate boundary 
with right-lateral motion. However, there are some deviations 
from pure transform behavior. As we will see, the rate on the 
San Andreas fault is less than the total plate motion because 
some of the motion occurs elsewhere within the broad plate 
boundary zone. In addition, in some places the San Andreas 
trend differs enough from the plate motion direction that dip-slip 
faulting occurs. Hence we think of the San Andreas as the 
primary feature of the essentially strike-slip portion of the plate 
boundary zone. 
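The San Andreas prediction can be reproduced, at least approximately, from the Pacific entry of Table 5.2-1. The sketch below is illustrative only: the function name `plate_velocity` and the radius of 6371 km are assumptions, and the azimuth uses `atan2` rather than the arctangent form of Eqn 8 so that the quadrant is handled automatically.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def plate_velocity(pole_lat, pole_lon, omega_deg_per_myr, site_lat, site_lon):
    """Return (rate in mm/yr, azimuth in degrees clockwise from north).

    Uses the north-south and east-west components of Eqn 7.
    """
    theta, phi = math.radians(pole_lat), math.radians(pole_lon)
    lam, mu = math.radians(site_lat), math.radians(site_lon)
    scale = EARTH_RADIUS_KM * math.radians(omega_deg_per_myr)  # km/Myr = mm/yr
    v_ns = scale * math.cos(theta) * math.sin(mu - phi)
    v_ew = scale * (math.sin(theta) * math.cos(lam)
                    - math.cos(theta) * math.sin(lam) * math.cos(mu - phi))
    rate = math.hypot(v_ns, v_ew)
    azimuth = math.degrees(math.atan2(v_ew, v_ns)) % 360.0
    return rate, azimuth


# Pacific with respect to North America, from Table 5.2-1.
PACIFIC_NA = (-48.709, 101.833, 0.7486)

# Site on the San Andreas fault at 36N, 239E.
rate, azimuth = plate_velocity(*PACIFIC_NA, 36.0, 239.0)
print(f"rate = {rate:.0f} mm/yr, azimuth = N{360.0 - azimuth:.0f}W")
```

With these inputs the result should be close to the 46 mm/yr at N36°W quoted above; small differences can arise from the radius assumed.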

Similarly, at a point on the Aleutian trench near the site 
of the great 1964 Alaska earthquake (Fig. 4.3-15) (62°N, 
212°E), we predict Pacific motion of 53 mm/yr at N14°W with 
respect to North America. This motion is into the trench, which 
is a Pacific-North America subduction zone. It is worth noting 
that for a given convergent relative motion either plate can be 
subducting. However, the relative direction is important, so the 
plates cannot be interchanged: if N14°W were the direction of 
motion of North America with respect to the Pacific, the mo¬ 
tion would be away from the boundary, which would then be 
a spreading center with the same rate. As for the San Andreas, 
the actual boundary zone shown by earthquakes and other 
deformation is wider and more complicated than the ideal. 
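The point about not interchanging the plates can be checked numerically. In the sketch below, which uses assumed helper names and an assumed radius of 6371 km, reversing the plate pair is modeled by making the rotation rate negative: the rate is unchanged but the azimuth flips by 180°.

```python
import math


def components(pole_lat, pole_lon, omega_deg_per_myr,
               site_lat, site_lon, radius_km=6371.0):
    """North-south and east-west velocity components (Eqn 7), in mm/yr."""
    theta, phi = math.radians(pole_lat), math.radians(pole_lon)
    lam, mu = math.radians(site_lat), math.radians(site_lon)
    scale = radius_km * math.radians(omega_deg_per_myr)
    v_ns = scale * math.cos(theta) * math.sin(mu - phi)
    v_ew = scale * (math.sin(theta) * math.cos(lam)
                    - math.cos(theta) * math.sin(lam) * math.cos(mu - phi))
    return v_ns, v_ew


def azimuth_deg(v_ns, v_ew):
    """Azimuth in degrees clockwise from north."""
    return math.degrees(math.atan2(v_ew, v_ns)) % 360.0


SITE = (62.0, 212.0)  # near the 1964 Alaska earthquake

# Pacific with respect to North America (Table 5.2-1).
pa_na = components(-48.709, 101.833, 0.7486, *SITE)
# North America with respect to the Pacific: same pole, negative rate.
na_pa = components(-48.709, 101.833, -0.7486, *SITE)

for label, v in (("PA-NA", pa_na), ("NA-PA", na_pa)):
    print(label, round(math.hypot(*v)), "mm/yr at azimuth",
          round(azimuth_deg(*v)), "deg")
```

The first line corresponds to motion into the trench; the second has the same rate with the direction reversed.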
Earthquake focal mechanisms within the boundary zone are 
consistent with the overall plate motions and illustrate some of 
their complexities. In the Gulf of California we see both strike-slip 
faulting along oceanic transforms and normal faulting on 
ridge segments. The San Andreas fault system, composed of the 
main fault and some others, has both pure strike-slip earth¬ 
quakes (Parkfield) and earthquakes with some dip-slip motion 
(Northridge (Section 4.5.3), San Fernando, and Loma Prieta) 
when it deviates from pure transform behavior. The seismicity 
also shows that the plate boundary zone is quite broad. 
Although the San Andreas fault system is the locus of most 
of the plate motion (Fig. 4.5-13) and hence large earthquakes, 
seismicity extends as far eastward as the Rocky Mountains. For 
example, the Landers earthquake shows strike-slip motion east 
of the San Andreas, and the Borah Peak earthquake illustrates 
the extensional faulting that occurs in the Basin and Range. 
These focal mechanisms are consistent with the motions shown 
by space-based geodetic measurements, discussed shortly, and 
with geologic studies. 

5.2.2 Global plate motions 

The relative plate motions show how the plate boundary geo¬ 
metry is evolving and has evolved. The Juan de Fuca plate is 
subducting under North America faster than new lithosphere 
is being added to it by sea floor spreading at its boundary with 
the Pacific plate, so this plate was larger in the past and is 
shrinking. Rotating the Pacific plate backwards with respect 
to North America shows that 10 million years ago the Gulf of 
California had not yet begun to open by sea floor spreading. 
These changes are part of the evolution of the plate boundary 
in western North America, in which the large oceanic Farallon 
plate that used to be between the Pacific and North American 
plates began subducting under North America at about 
40 Ma, 3 leaving the Juan de Fuca plate as a remnant and 
forming the San Andreas fault. 

At this point you may be wondering how Euler poles are 
found. Until recently, this was done by combining three dif¬ 
ferent types of data from different boundaries. The rates of 
spreading are found from sea floor magnetic anomalies, which 
form as the hot rock at ridges cools and acquires magnetization 
parallel to the earth’s magnetic field. Because the history of 
reversals of the earth’s magnetic field is known, the anomalies 
can be dated, so their distance from the ridge where they 
formed shows how fast the sea floor moved away from the 
ridge. The directions of motion are found from the orientations 
of transform faults and the slip vectors of earthquakes on trans¬ 
forms and at subduction zones. Euler vectors are found from 
the relative motion data, using geometrical conditions we have 
discussed. The process is easy to visualize. Because slip vectors 
and transform faults lie on small circles about the pole, the pole 
must lie on a great circle at right angles to them (Fig. 5.2-2). 
Similarly, the rate of plate motion increases with the sine of 


the distance from the pole (Eqn 2). These constraints make it 
possible to locate the poles. Determination of Euler vectors for 
all the plates can thus be treated as an overdetermined least 
squares problem whose solution (Section 7.5) gives a global 
relative plate motion model. Because these models use spread¬ 
ing rates determined from magnetic anomaly data that span 
several million years, they describe plate motions averaged 
over the past few million years. 4 

Table 5.2-1 gives such a model, known as NUVEL-1A, 5 
which specifies the motions of plates (Fig. 5.2-4) with respect 
to North America. The vectors follow the convention that each 
named plate moves counterclockwise relative to North America. 
Although the table lists only Euler vectors with respect to 
North America, the motion of plates with respect to other 
plates is easily found using vector arithmetic. For example, 

$$\boldsymbol{\omega}_{ij} = -\boldsymbol{\omega}_{ji}, \tag{9}$$

so we reverse the plate pair using the negative of the Euler 
vector. The pole for the new plate pair is the antipole, with 
latitude of opposite sign and longitude increased by 180°. The 
magnitude (rotation rate) stays the same. We can also reverse 
the plate pair by keeping the same pole and making the rota¬ 
tion rate negative (clockwise rather than counterclockwise). 
Although we usually use positive rotation rates, negative ones 
sometimes help us visualize the motion. For example, the table 
shows the Pacific-North America pole at about -49°N, 102°E, 
so the North America-Pacific pole is at about 49°N, (102 + 180 
= 282)°E, which is in southeastern Canada. Thus, about this 
pole, North America rotates counterclockwise with respect to 
the Pacific, or the Pacific rotates clockwise with respect to 
North America, as shown in Fig. 5.2-3. 
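The equivalence of the two ways of reversing a plate pair is easy to confirm numerically. The following sketch uses hypothetical helper names (`euler_cartesian`, `antipole`) and works in the units of Table 5.2-1; it shows that the antipole with the same rate and the original pole with a negative rate give the same Cartesian Euler vector.

```python
import math


def euler_cartesian(lat_deg, lon_deg, rate):
    """Cartesian components of an Euler vector (Eqn 4)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (rate * math.cos(lat) * math.cos(lon),
            rate * math.cos(lat) * math.sin(lon),
            rate * math.sin(lat))


def antipole(lat_deg, lon_deg):
    """Pole for the reversed plate pair: opposite latitude, longitude + 180."""
    return -lat_deg, (lon_deg + 180.0) % 360.0


# Pacific with respect to North America (Table 5.2-1).
lat, lon, rate = -48.709, 101.833, 0.7486

anti_lat, anti_lon = antipole(lat, lon)
via_antipole = euler_cartesian(anti_lat, anti_lon, rate)
via_negative_rate = euler_cartesian(lat, lon, -rate)

print(anti_lat, round(anti_lon, 3))
print(via_antipole)
print(via_negative_rate)
```

Both printed vectors are the negative of the original Pacific-North America Euler vector.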

For other plate pairs we assume that the plates are rigid, so 
all motion occurs at their boundaries. We can then add Euler 
vectors, 

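$$\boldsymbol{\omega}_{jk} = \boldsymbol{\omega}_{ji} + \boldsymbol{\omega}_{ik}, \tag{10}$$

because the motion of plate $j$ with respect to plate $k$ equals 
the sum of the motion of plate $j$ with respect to plate $i$ and the 
motion of plate $i$ with respect to plate $k$. Thus if we start with a 
set of vectors all with respect to one plate, e.g., $i$, we use 

$$\boldsymbol{\omega}_{jk} = \boldsymbol{\omega}_{ji} - \boldsymbol{\omega}_{ki} \tag{11}$$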

to form any Euler vector needed. These operations are easily 
done using the Cartesian components (Eqn 4), as shown in 
this chapter’s problems. We can also perform the analogous 
operations on linear velocity vectors at a specific site. 
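
As a rough sketch of this Cartesian-component arithmetic (an illustration, not code from the text), the sign reversal and subtraction rules might be written as follows. The function names are assumptions made for this example; the numbers are the Pacific and Cocos entries of Table 5.2-1.

```python
import math

def to_cartesian(lat_deg, lon_deg, rate):
    """Pole (degrees) and rotation rate -> Cartesian Euler vector components."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (rate * math.cos(lat) * math.cos(lon),
            rate * math.cos(lat) * math.sin(lon),
            rate * math.sin(lat))

def to_pole(vec):
    """Cartesian components -> (latitude, longitude in 0-360 deg, rate)."""
    x, y, z = vec
    rate = math.sqrt(x * x + y * y + z * z)
    lat = math.degrees(math.asin(z / rate))
    lon = math.degrees(math.atan2(y, x)) % 360.0
    return lat, lon, rate

def negate(vec):
    """Reverse the plate pair: omega_kj = -omega_jk."""
    return tuple(-c for c in vec)

def subtract(a, b):
    """omega_jk = omega_ji - omega_ki for two vectors sharing reference plate i."""
    return tuple(p - q for p, q in zip(a, b))

# Entries from Table 5.2-1 (deg N, deg E, deg/Myr), both relative to North America.
pa_na = to_cartesian(-48.709, 101.833, 0.7486)
co_na = to_cartesian(27.883, -120.679, 1.3572)

na_pa = negate(pa_na)            # North America relative to Pacific
co_pa = subtract(co_na, pa_na)   # Cocos relative to Pacific

for name, vec in (("NA-PA", na_pa), ("CO-PA", co_pa)):
    lat, lon, rate = to_pole(vec)
    print(f"{name}: pole {lat:.2f} N, {lon:.2f} E, rate {rate:.4f} deg/Myr")
```

Reversing the Pacific-North America vector returns the antipole (48.709°N, 281.833°E) with an unchanged rate, as described in the text. Working in Cartesian components avoids any special handling of latitude and longitude when vectors are added or subtracted.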

4 The most recent magnetic reversal occurred about 780,000 years ago, so any plate 
model based on paleomagnetic data must average at least over that interval. 

5 NUVEL-1 (Northwestern University VELocity) was developed as a new 
(“nouvelle”) model (DeMets et al., 1990). The multiyear development prompted 
the suggestion that “OLDVEL” might be a better name. Due to changes in the 
paleomagnetic time scale, the model was revised to NUVEL-1A (DeMets et al., 1994). 
This change caused a slight difference in the rates of relative motion, but not in the 
poles and hence directions of relative motion. 


3 “Ma” is often used to denote millions of years before the present. 

Table 5.2-1 Euler vectors with respect to North America (NA). 

Plate                 Pole latitude (°N)   Longitude (°E)   |ω| (°/Myr)
Pacific (PA)               -48.709            101.833          0.7486
Africa (AF)                 78.807             38.279          0.2380
Antarctica (AN)             60.511            119.619          0.2540
Arabia (AR)                 44.132             25.586          0.5688
Australia (AU)              29.112             49.006          0.7579
Caribbean (CA)              74.346            153.892          0.1031
Cocos (CO)                  27.883           -120.679          1.3572
Eurasia (EU)                62.408            135.831          0.2137
India (IN)                  43.281             29.570          0.5803
Nazca (NZ)                  61.544           -109.781          0.6362
South America (SA)         -16.290            121.876          0.1465
Juan de Fuca (JF)          -22.417             67.203          0.8297
Philippine (PH)            -43.986            -19.814          0.8389
Rivera (RI)                 22.821           -109.407          1.8032
Scotia (SC)                -43.459            123.120          0.0925
NNR*                         2.429             93.965          0.2064

Source: After DeMets et al., 1994. 

*No net rotation, defined in Section 5.2.4. 



Fig. 5.2-4 Relative plate motions for the NUVEL-1 global plate motion model. Arrow lengths are proportional to the displacement if plates maintain their 
present relative velocity for 25 Myr. Divergence across mid-ocean ridges is shown by diverging arrows. Convergence is shown by single arrows on the 
underthrust plate. Plate boundaries are shown as diffuse zones implied by seismicity, topography, or other evidence of faulting. Fine stipple shows mainly 
subaerial regions where the deformation has been inferred from seismicity, topography, other evidence of faulting, or some combination of these. Medium 
stipple shows mainly submarine regions where the nonclosure of plate circuits indicates measurable deformation; in most cases these zones are also 
marked by earthquakes. Coarse stipple shows mainly submarine regions where the deformation is inferred mostly from the presence of earthquakes. The 
geometry of these zones, and in some cases their existence, is under investigation. (Gordon and Stein, 1992. Science, 256, 333-42, copyright 1992 
American Association for the Advancement of Science.) 

Fig. 5.2-5 Global plate circuit geometry for the NUVEL-1 plate motion 
model. Relative motion data are used on the boundaries indicated. 
(DeMets et al., 1990. Geophys. J. Int., 101, 425-78.) 

Such vector addition is important because we only have certain types 
of data for individual boundaries (Fig. 5.2-5). Although spreading 
centers provide rates from the magnetic anomalies and azimuths from 
both transform faults and slip vectors, only the direction of motion 
is directly known at subduction zones. As a result, convergence rates 
at subduction zones are estimated by global closure, combining data 
from all plate boundaries (Section 7.5). Thus the predicted rate at which 
the Cocos plate subducts beneath North America, causing 
large earthquakes in Mexico, depends on the measured rates of 
Cocos-Pacific spreading on the East Pacific rise and Pacific- 
North America spreading in the Gulf of California. In some 
cases, such as relative motion between North and South Amer¬ 
ica, no direct data were used because the boundary location and 
geometry are unclear, so the relative motion is inferred entirely 
from closure. Not surprisingly, the motions of plate pairs based 
on both rate and azimuth data appear to be better known. 

Figure 5.2-4 shows the predicted relative motions at plate 
boundaries around the world. As shown for the Pacific-North 
America boundary in Fig. 5.2-3 and discussed in general terms 
in later sections, the predicted motions correspond to the earth¬ 
quake mechanisms. Moreover, we can use the plate motions to 
make inferences about future earthquakes. For example, even 
though we do not have seismological observations of large 
earthquakes along the boundary between the Juan de Fuca 
and North American plates, the plate motions predict that 
such earthquakes could result from the subduction of the Juan 
de Fuca plate beneath North America. Evidence for this subduction 
is given by the presence of the Cascade volcanoes (such 
as Mount Saint Helens and Mount Rainier) and paleoseismic 
records (Section 1.2.5) that are interpreted as evidence of large 
past earthquakes. 

Figure 5.2-4 also illustrates that boundaries between plates 
are often diffuse. Seismicity, active faulting, and elevated topo¬ 
graphy often indicate a broad zone of deformation between 
plate interiors. This effect is evident in continental lithosphere, 
such as the India-Eurasia collision zone in Asia or the Pacific- 
North America boundary zone in the western USA, but can 
also sometimes be seen in oceanic lithosphere, as in the Central 
Indian Ocean. Plate boundary zones cover about 15% of the 
earth’s surface, and about 40% of the earth’s population lives 
within them. 


Earthquakes are among the best tools for investigating plate 
boundary zones and other deviations from plate rigidity. They 
provide one of the best indicators of the location of boundary 
zones, so new earthquakes often change our views. We also 
use plate motion data, many of which are earthquake slip vec¬ 
tors. For example, Fig. 5.2-4 shows zones of seismicity in the 
Central Indian Ocean (Section 5.5.2) as boundaries between 
distinct Indian and Australian plates, rather than as within a 
single Indo-Australian plate, because spreading rates along the 
Central Indian Ocean ridge are better fit by a two-plate model. 
A similar argument justifies the assumption of a small Rivera 
plate distinct from the Cocos plate. Another approach is to use 
the global plate circuit closures (Fig. 5.2-5). Recall that forming 
a Euler vector from two others (Eqn 10) assumes that all three 
plates are rigid. Hence this assumption can be used to test for 
deviations from rigidity. To do this, we form a best-fitting vec¬ 
tor for a plate pair, using only data from that pair of plates’ 
boundary, and a closure fitting vector from data elsewhere in 
the world. If the plates were rigid, the two vectors would be 
the same. However, a significant difference between the two 
indicates a deviation from rigidity, or another problem with 
the plate motion model. For example, such analysis shows 
systematic deviations along some subduction zones, suggesting 
that the slip vectors of the trench earthquakes do not exactly 
reflect plate motions because a sliver of forearc material in the 
overriding plate moves separately from the remainder of the 
overriding plate (Section 5.4.3). 

A variant of this approach is to examine the Euler vectors for 
three plates that meet at a triple junction, compute best-fitting 
Euler vectors for each of the three plate pairs, and sum them. 
For rigid plates, Eqn 10 shows that the sum should be zero. 
However, when this was done for the junction in the Central 
Indian Ocean, assuming that it was where the African, Indo- 
Australian, and Antarctic plates met, the Euler vector sum dif¬ 
fered significantly from zero, indicating deviations from plate 
rigidity. As plate motion data improve, it seems that what 
was treated as a three-plate system may include as many as 
six resolvable plates (Antarctica, distinct Nubia (West Africa) 
and Somalia (East Africa), India, Australia, and Capricorn 
(between India and Australia)). Hence models of plate 
boundaries and motions improve with time (Fig. 1.1-9). For 
example, although the model in Fig. 5.2-4 has a single African 
plate, recent models seek to resolve the motion between Nubia 
and Somalia (Fig. 5.6-2). 
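
The zero-sum condition used in this triple junction test can be spelled out in one line. As a brief sketch, for three rigid plates i, j, and k, combining Eqn 10 with the rule that reversing a plate pair negates its Euler vector gives

$$\boldsymbol{\omega}_{jk} = \boldsymbol{\omega}_{ji} + \boldsymbol{\omega}_{ik} \;\;\Longrightarrow\;\; \boldsymbol{\omega}_{ji} + \boldsymbol{\omega}_{ik} - \boldsymbol{\omega}_{jk} = 0 \;\;\Longrightarrow\;\; \boldsymbol{\omega}_{ji} + \boldsymbol{\omega}_{ik} + \boldsymbol{\omega}_{kj} = 0 .$$

A sum of three independently determined best-fitting vectors that differs from zero by more than the data uncertainties therefore points to nonrigid behavior or to an unrecognized plate.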

5.2.3 Space-based geodesy 

New plate motion data have become available in recent years 
due to the rapidly evolving techniques of space-based geodesy. 
Using space-based measurements to determine plate motions 
was suggested by Alfred Wegener when he proposed the theory 
of continental drift in 1915. Wegener realized that proving 
continents moved apart was a formidable challenge. Although 
geodesy, the science of measuring the shape of, and distances 
on, the earth, was well established, standard surveying 
methods offered no hope of measuring slow motions between 
continents far apart. Wegener thus decided to measure the 
distance between continents using astronomical observations. 6 
However, because measuring continental drift called for 
measurement accuracies far greater than ever before to show small 
changes in positions over a few years, Wegener’s attempts 
failed, and the idea of continental drift was largely rejected. 

Fig. 5.2-6 Comparison of rates determined by space geodesy with those 
predicted by the NUVEL-1 global plate motion model. The space geodetic 
rates are determined from sites located away from plate boundaries to 
reduce the effects of deformation near the boundaries. The slope of the 
line is 0.94, indicating that plate motions over a decade are very similar to 
those predicted by a model averaging over 3 million years. (Robbins et al., 
1993. Contributions of Space Geodesy to Geodynamics, 21-36, copyright 
by the American Geophysical Union.) 

By the 1970s the story was very different. Geologists ac¬ 
cepted continental drift, in large part because paleomagnetic 
measurements showed that continents had in fact moved over 
millions of years. It thus seemed natural to see if modern 
space-based technology could accomplish Wegener’s dream of 
measuring continental motions over a few years. Three basic 
approaches were attempted. Each faced formidable technical 
challenges — and all succeeded. Hence, using the techniques 
discussed in Section 4.5.1, plate motions can now be measured to 
a precision of a few mm/yr or better, using a few years of data 
from systems including Very Long Baseline Interferometry 
(VLBI), Satellite Laser Ranging (SLR), and the Global Position¬ 
ing System (GPS). 

6 Using an extraterrestrial reference has a long history; in about 230 BC Eratosthenes 
found the Earth’s size from observations of the sun’s position at different sites, and 
navigators have found their positions by observing the sun and stars. 

Space geodesy measures both the rate and the azimuth of the 
motions between sites, and can thus be used to compute relative 
plate motions. One of the most important results of space 
geodesy for seismology is that plate motions have remained 
generally steady over the past few million years. This is shown 
by the striking agreement between motions measured over a 
few years by space geodesy and the predictions of global plate 
motion models that average over the past three million years 
(Fig. 5.2-6). The general agreement is consistent with the idea 
that although motion at plate boundaries can be episodic, as 
in large earthquakes, the viscous asthenosphere damps out 
the transient motions (much like the damping element in a 
seismometer, Section 6.6) and causes steady motion between 
plate interiors. This steadiness implies that plate motion 
models can be used for comparison with earthquake data. 

Space geodesy surmounts a major difficulty faced by models 
like NUVEL-1A: namely, that the data used (spreading rates, 
transform azimuths, and slip vectors) are at plate boundaries, 
so the model provides only the net motion across a boundary. 
By contrast, space geodesy can also measure the motion of sites 
within plate boundary zones. For example, Fig. 5.2-3 shows 
the motions of GPS and VLBI sites within the North America- 
Pacific boundary zone. Sites in eastern North America move 
so slowly — less than 2 mm/yr — with respect to each other that 
their motion vectors cannot be seen on this scale. These sites 
thus define a rigid reference frame for the stable interior of the 
North American plate. Sites west of the San Andreas fault move 
at essentially the rate and direction predicted for the Pacific 
plate by the global plate motion model. The site vectors show 
that most of the plate motion occurs along the San Andreas 
fault system, but significant motions occur for some distance 
eastward. The geodetic motions are consistent with the focal 
mechanisms and geological data. Thus, as discussed further in 
Section 5.6, the different data types are used together to study 
how the seismic and aseismic portions of the deformation vary 
in space and time in the diffuse deformation zones that charac¬ 
terize many plate boundaries. This is done both on large scales, 
as shown here, and for studies of smaller areas and individual 
earthquakes (Section 4.5). 
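
To compare a geodetic site velocity with a plate model, one needs the model's predicted velocity at that site. The following is a minimal sketch of that prediction, assuming a spherical earth of radius 6371 km and a hypothetical site at 36°N, 121°W west of the San Andreas fault (neither value comes from the text). The Euler vector is the Pacific entry of Table 5.2-1; the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean earth radius

def euler_vector(lat_deg, lon_deg, rate_deg_per_myr):
    """Cartesian Euler vector in rad/Myr from pole position and rate."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    w = math.radians(rate_deg_per_myr)
    return (w * math.cos(lat) * math.cos(lon),
            w * math.cos(lat) * math.sin(lon),
            w * math.sin(lat))

def site_velocity(omega, site_lat_deg, site_lon_deg):
    """Return (speed in mm/yr, azimuth in degrees clockwise from north)
    of v = omega x r at a site on a spherical earth."""
    lat, lon = math.radians(site_lat_deg), math.radians(site_lon_deg)
    r = (EARTH_RADIUS_KM * math.cos(lat) * math.cos(lon),
         EARTH_RADIUS_KM * math.cos(lat) * math.sin(lon),
         EARTH_RADIUS_KM * math.sin(lat))
    wx, wy, wz = omega
    # Cross product omega x r; km/Myr is numerically equal to mm/yr.
    v = (wy * r[2] - wz * r[1],
         wz * r[0] - wx * r[2],
         wx * r[1] - wy * r[0])
    # Local unit vectors at the site.
    north = (-math.sin(lat) * math.cos(lon),
             -math.sin(lat) * math.sin(lon),
             math.cos(lat))
    east = (-math.sin(lon), math.cos(lon), 0.0)
    v_north = sum(a * b for a, b in zip(v, north))
    v_east = sum(a * b for a, b in zip(v, east))
    speed = math.hypot(v_north, v_east)
    azimuth = math.degrees(math.atan2(v_east, v_north)) % 360.0
    return speed, azimuth

# Pacific relative to North America (Table 5.2-1).
pa_na = euler_vector(-48.709, 101.833, 0.7486)

# Hypothetical site on the Pacific side of the boundary zone.
speed, azimuth = site_velocity(pa_na, 36.0, -121.0)
print(f"predicted motion: {speed:.1f} mm/yr toward azimuth {azimuth:.0f} deg")
```

The prediction is a northwestward motion of several tens of mm/yr relative to stable North America. Subtracting such a prediction from an observed site vector would show how much of the plate motion a site within the boundary zone has taken up.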

Space geodesy is also used to study the relatively rare, but some¬ 
times large, earthquakes within plates. Global plate motion 
models give no idea where or how often intraplate earthquakes 
should occur, beyond the trivial prediction that they should not 
occur because there is no deformation within ideal rigid plates. 
Space geodesy is being combined with earthquake locations, 
focal mechanisms, and other geological and geophysical data 
to investigate the motions and stresses within plates and how 
they give rise to intraplate earthquakes (Section 5.6.3). 

5.2.4 Absolute plate motions 

So far, we have discussed the relative motions between plates, 
which have traditionally been of greatest interest to seismolog¬ 
ists because most earthquakes reflect these motions. However, 
in some applications it is important to consider absolute plate 
motions, those with respect to the deep mantle. 

Fig. 5.2-7 Top: Illustration of the formation of a volcanic island chain by 
plate motion over a fixed hot spot. Bottom: Ages, in millions of years, of 
volcanoes in the Hawaiian-Emperor chain. 

In general, both plates and plate boundaries move with 
respect to the deep mantle. To see this, assume that the African 
plate were not moving with respect to the deep mantle. In this 
case, as lithosphere was added to the plate by sea floor spread¬ 
ing at the Mid-Atlantic ridge (Fig. 5.2-4), both the ridge and the 
South American plate would move westward with respect to 
the mantle. Conversely, as the African plate lost area by 
subduction beneath the Eurasian plate in the Mediterranean, the 
trench would “roll backward,” causing both it and Eurasia to 
move southward relative to the mantle. Such motions can have 
important consequences for processes at plate boundaries (e.g. 
Fig. 5.3-10). 

Absolute plate motions cannot be measured directly. Hence 
we infer these motions in two ways. One uses the hot spot 
hypothesis, in which certain linear volcanic trends result from 
the motion of a plate over a hot spot, or fixed source of volcanism, 
which causes melting in the overriding plate (Fig. 5.2-7). If 
the overriding plate is oceanic, its motion causes a progression 
from active volcanism that builds the islands, to older islands, 
to underwater seamounts as the sea floor moves away from 
the hot spot, cools, and subsides. This process leaves a broad, 
shallow, topographic swell around the hot spot and a 
characteristic volcanic age progression away from it, as shown for the 
Hawaiian-Emperor seamount chain. The ages of volcanism 
range from present, on the currently active island of Hawaii, to 
a few million years on the other Hawaiian islands, 7 to about 28 
Ma at Midway island, and about 70 Ma where the seamount 
chain vanishes into the Aleutian trench. Thus the direction and 
age of the volcanic chain give the motion of the plate with 
respect to the hot spot. For example, the bend in the Hawaiian- 
Emperor seamount chain has been interpreted as indicating 
that the Pacific plate changed direction about 40 million years 
ago. Hence using hot spot tracks beneath different plates, and 
assuming that the hot spots are fixed with respect to the deep 
mantle (or move relative to each other more slowly than 
plates), yields a hot spot reference frame. 

It is often further assumed that hot spots result from plumes 
of hot material rising from great depth, perhaps even the core¬ 
mantle boundary (Fig. 5.1-2). The concepts of hot spots and 
plumes are attractive and widely used, but the relation between 
the persistent volcanism and possible deep mantle plumes re¬ 
mains a subject of active investigation because there are many 
deviations from what would be expected. Some hot spots 
move significantly, some chains show no clear age progression, 
evidence for plate motion changes associated with bends like 
that in Fig. 5.2-7 is weak, and oceanic heat flow data show little 
or no thermal anomalies at the swells. Seismological studies 
find low-velocity anomalies, but assessing their depth extent 
and relation to possible plumes is challenging. However, the 
hot spot reference frame is similar to one obtained by assuming 
there is no net rotation (NNR) of the lithosphere as a whole, 
and hence that the sum of the absolute motion of all plates 
weighted by their area is zero. Thus despite unresolved ques¬ 
tions about the nature and existence of hot spots and plumes, 
NNR reference frames are often used to infer absolute motions. 

To compute absolute motions, we recognize that motions 
in an absolute reference frame correspond to adding a rotation 
to all the plates. Thus we use the Euler vector formulation and 
treat the absolute reference frame as mathematically equivalent 
to another plate. We define $\boldsymbol{\Omega}_i$ as the Euler vector of plate i 
in an absolute reference frame. For example, Table 5.2-1 gives 
the NNR Euler vector relative to the North American plate 
($\boldsymbol{\omega}_{\mathrm{NNR-NA}}$), so its negative ($\boldsymbol{\omega}_{\mathrm{NA-NNR}}$) is the absolute Euler 
vector $\boldsymbol{\Omega}_{\mathrm{NA}}$ for North America in the NNR reference frame. 
The linear velocity at a point $\mathbf{r}$ is found by analogy to Eqn 1: 

$$\mathbf{v}_i = \boldsymbol{\Omega}_i \times \mathbf{r}. \tag{12}$$
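
As an illustration of how Eqn 12 can be evaluated, the sketch below negates the NNR entry of Table 5.2-1 to obtain the absolute Euler vector of North America and computes the velocity at a site at 44°N, 110°W. The spherical earth radius of 6371 km and the helper names are assumptions made for this example, not values or code from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean earth radius

def euler_vector(lat_deg, lon_deg, rate_deg_per_myr):
    """Cartesian Euler vector (rad/Myr) from pole latitude, longitude, rate."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    w = math.radians(rate_deg_per_myr)
    return (w * math.cos(lat) * math.cos(lon),
            w * math.cos(lat) * math.sin(lon),
            w * math.sin(lat))

def absolute_velocity(big_omega, lat_deg, lon_deg):
    """v = Omega x r, returned as (speed mm/yr, azimuth deg from north)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    clat, slat = math.cos(lat), math.sin(lat)
    clon, slon = math.cos(lon), math.sin(lon)
    r = (EARTH_RADIUS_KM * clat * clon,
         EARTH_RADIUS_KM * clat * slon,
         EARTH_RADIUS_KM * slat)
    ox, oy, oz = big_omega
    v = (oy * r[2] - oz * r[1],      # km/Myr, numerically mm/yr
         oz * r[0] - ox * r[2],
         ox * r[1] - oy * r[0])
    v_north = -slat * clon * v[0] - slat * slon * v[1] + clat * v[2]
    v_east = -slon * v[0] + clon * v[1]
    speed = math.hypot(v_north, v_east)
    azimuth = math.degrees(math.atan2(v_east, v_north)) % 360.0
    return speed, azimuth

# NNR relative to North America (Table 5.2-1); its negative is the
# absolute Euler vector of North America in the NNR frame.
nnr_na = euler_vector(2.429, 93.965, 0.2064)
omega_na = tuple(-c for c in nnr_na)

speed, azimuth = absolute_velocity(omega_na, 44.0, -110.0)
print(f"North America at (44, -110): {speed:.1f} mm/yr, azimuth {azimuth:.0f} deg")
```

Under these assumptions the result is close to 18 mm/yr directed toward the southwest.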

Thus we find the motion of North America with respect to 
the hot spot thought to be producing the volcanism and 
earthquakes in Yellowstone National Park (44°, -110°) to be 

18 mm/yr directed N239°E. This motion is along the trend 
connecting the present volcanism in Yellowstone to the 
Snake River Plain basalts (Fig. 5.2-8), which are thought to be 
its track, a continental analogy to the Hawaiian-Emperor 
seamount chain. 

7 This age progression was recognized by native Hawaiians, who attributed it to the 
order in which the volcano goddess Pele plucked the islands from the sea. 

Fig. 5.2-8 Comparison of the predicted absolute motion of North 
America to the Snake River Plain basalts, which are thought to be the 
track of a hot spot now producing volcanism in Yellowstone National 
Park. (After Smith and Braile, 1994. J. Volcan. Geotherm. Res., 61, 
121-87, with permission from Elsevier Science.) 

Relative and absolute Euler vectors are simply related because 

$$\boldsymbol{\omega}_{ij} = \boldsymbol{\Omega}_i - \boldsymbol{\Omega}_j, \tag{13}$$

the relative Euler vector for two plates, is the difference 
between their absolute Euler vectors. Thus, if we know one 
plate’s absolute motion, we can find all the others from the 
relative motions. For example, the absolute motion of the 
Pacific plate can be found from Table 5.2-1, which gives its 
vector relative to North America, using 

$$\boldsymbol{\Omega}_{\mathrm{PA}} = \boldsymbol{\omega}_{\mathrm{PA-NA}} + \boldsymbol{\Omega}_{\mathrm{NA}}. \tag{14}$$

Absolute motions are important in several seismological 
applications. Seismology is used to study hot spots and their 
effects, including the resulting intraplate earthquakes like 
those associated with the volcanism in Hawaii. For example, 
Fig. 2.8-5 illustrated the use of surface wave dispersion to study 
the velocity structure under the Walvis ridge, which is thought 
to be the track produced by a hot spot under the Mid-Atlantic 
ridge. A second application involves seismic anisotropy in the 
mantle (Section 3.6), which is thought to reflect flow of 
olivine-rich material in a direction that is often consistent with the 
predicted absolute plate motions. Thus seismic anisotropy, seismic 
velocities, and absolute motions are being combined to model 
mantle flow. 


5.3 Spreading centers 

Fig. 5.3-1 Possible tectonic settings of earthquakes at an oceanic 
spreading center. Most events occur on the active segment of the 
transform and have strike-slip mechanisms consistent with transform 
faulting. On a slow-spreading ridge, like the Mid-Atlantic, normal fault 
earthquakes also occur. 
[Figure labels: ridge, transform, fracture zone, normal fault, strike-slip fault (right-lateral and left-lateral), no seismicity.] 

Because the lithosphere forms at spreading centers, we begin 
with an overview of such systems and the earthquakes within 
them. We will see that seismological observations both de¬ 
monstrate and reflect the basic kinematic model for ridges 
and transforms. Moreover, they provide key evidence for the 
thermal-mechanical processes that control the formation and 
evolution of the oceanic lithosphere. 

5.3.1 Geometry of ridges and transforms 

Mid-ocean ridges are marked by earthquakes, which provide 
important information about the sea floor spreading process. 
Figure 5.3-1 is a schematic diagram of a portion of a spreading 
ridge offset by transform faults. Because new lithosphere forms 
at ridges and then moves away, transform faults are segments 
of the boundaries between plates, across which lithosphere 
moves in opposite directions. A given pair of plates can have 
either right- or left-lateral motion, depending on the direction 
in which a transform offsets the ridge; both reflect the same 
direction of relative plate motion. This motion across the 
transform is not what produced the offset of the ridge crest. In 
fact, in the usual situation, in which spreading is approximately 
symmetric (equal rates on either side), the length of the 
transform will not change with time. This is a very different 
geometry from a transcurrent fault, where the offset between 
ridge segments is produced by motion on the fault and 
increases with time. 

Fig. 5.3-2 Maps contrasting faulting on slow- and fast-spreading 
centers. Top: The slow Mid-Atlantic ridge has earthquakes on both the 
active transform and the ridge segments. Strike-slip faulting on a plane 
parallel to the transform azimuth is characteristic. On the ridge segments, 
normal faulting with nodal planes parallel to the ridge trend is seen. 
Bottom: The fast East Pacific rise has only strike-slip earthquakes on the 
transforms. (Stein and Woods, 1989.) 

The focal mechanisms illustrate these ideas. Figure 5.3-2 
(top) shows a portion of the Mid-Atlantic ridge composed of 
north-south-trending ridge segments that are offset by 
transform faults such as the Vema transform that trend approximately 
east-west. Both the ridge crest and the transforms are 
seismically active. The mechanisms show that the relative 
motion along the transform is right-lateral. Sea floor spread¬ 
ing must be occurring on the ridge segments to produce the 
observed relative motion. For this reason, earthquakes occur 
almost exclusively on the active segment of the transform fault 
between the two ridge segments, although an inactive exten¬ 
sion known as a fracture zone extends to either side. Although 
no relative plate motion occurs on the fracture zone, 1 it is 
often marked by a topographic feature due to the contrast 
in lithospheric ages across it. 

1 Unfortunately, some transform faults named before this distinction became clear 
are known as “fracture zones” along their entire length. 


Fig. 5.3-3 Cross-section (west to east) through the Mid-Atlantic ridge. The fault plane 
inferred from the focal mechanisms of large earthquakes is consistent with 
the locations of microearthquakes (dots) determined using ocean bottom 
seismometers. Dashed lines show P-wave velocity structure. (Toomey 
et al., 1988. J. Geophys. Res., 93, 9093-112, copyright by the American 
Geophysical Union.) 

Earthquakes also occur on the spreading segments. Their 
focal mechanisms show normal faulting, with nodal planes 
trending approximately along the ridge axis. These normal 
fault earthquakes are thought to be associated with the forma¬ 
tion of the axial valley. For example, Fig. 5.3-3 shows a cross- 
section through the Mid-Atlantic ridge. The fault planes 
inferred from teleseismic focal mechanisms and the locations 
of microearthquakes determined using ocean bottom seismo¬ 
meters are consistent with normal faulting along the east side of 
the valley. Slip on this fault over 10,000 years would be enough 
to produce the observed geometry, including the eastward tilt 
of the valley floor. 

The seismicity differs along the East Pacific rise. Here (Fig. 
5.3-2, bottom) earthquakes occur on the transform faults with 
the expected strike-slip mechanisms, but few earthquakes occur 
on the ridge crest. This is probably because the East Pacific rise 
has an axial high, rather than the axial valley that occurs at 
the Mid-Atlantic ridge. 2 This difference appears to reflect the 
spreading rates: ridges spreading at less than about 60 mm/yr 
usually have axial valleys, whereas faster-spreading ridges have 
axial highs and thus do not have ridge crest normal faulting. 

These examples show the spreading process at its simplest, 
but there can be complexities. Spreading can be asymmetric 
(one flank faster than the other) or oblique, such that the 
spreading is not perpendicular to the ridge axis. In addition, the 
geometry of a ridge system can change with time, as discussed 
in Section 5.3.3. 

5.3.2 Evolution of the oceanic lithosphere 

To understand the difference between fast- and slow-spreading 
ridges, and the nature of the earthquakes associated with 
them, it is important to understand the evolution of the oceanic 
lithosphere. This process can be described using a simple, but 
powerful, model for the formation of the lithosphere by hot 
material at the ridge, which cools as the plate moves away. 

2 This is often shown incorrectly on older maps. 

Fig. 5.3-4 Model for the cooling of an oceanic plate as it moves away from the ridge axis (left). Because a column moves away from the ridge faster than 
heat is conducted in the horizontal direction (right), the cooling in the vertical direction can be treated as a one-dimensional problem. (After Turcotte 
and Schubert, 1982.) 

In this model, material at the ridge at a mantle temperature 
$T_m$ (1300-1400 °C) is brought to the ocean floor, which has a 
temperature $T_s$. The material then moves away at a velocity $v$, 
while its upper surface remains at $T_s$ (Fig. 5.3-4). Because the 
plate moves away from the ridge faster than heat is conducted 
horizontally, we can consider only vertical heat conduction. 
Mathematically, this is the same as the cooling of a halfspace 
originally at temperature $T = T_m$, whose surface is suddenly 
cooled to $T_s$ at time $t = 0$. 

The temperature as a function of depth and time is given 
by the one-dimensional heat flow equation, which relates the 
temperature change with time in a piece of material to the rate 
at which heat is conducted out of it, 


$$\frac{\partial T(z,t)}{\partial t} = \frac{k}{\rho C_p}\,\frac{\partial^2 T(z,t)}{\partial z^2} = \kappa\,\frac{\partial^2 T(z,t)}{\partial z^2}. \tag{1}$$


$\kappa$, known as the thermal diffusivity, is a property of the 
material that measures the rate at which heat is conducted. It 
has units of distance squared divided by time, and is defined as 
$\kappa = k/(\rho C_p)$, where $k$ is the thermal conductivity, $\rho$ is the density, 
and $C_p$ is the specific heat at constant pressure. 

The well known solution to Eqn 1 is 

$$T(z,t) = T_s + (T_m - T_s)\,\mathrm{erf}\!\left(\frac{z}{2\sqrt{\kappa t}}\right), \tag{2}$$

where 

$$\mathrm{erf}(s) = \frac{2}{\sqrt{\pi}} \int_0^s e^{-\sigma^2}\, d\sigma \tag{3}$$

is known as the error function. Figure 5.3-5 (right) shows how 
this function varies between erf (0) = 0 and erf (3) ≈ 1. Thus 
cooling starts at the surface and deepens with time (Fig. 5.3-5, 
left). 

Fig. 5.3-5 Left: Cooling of a halfspace as described by the 
one-dimensional heat flow equation. The surface is cooled at time zero, and 
then the interior cools with time. Right: The error function, which 
controls the cooling solution shown. 

Assuming that any column of oceanic lithosphere cools this 
way, and that the sea floor temperature is $T_s$ = 0 °C, then 

$$T(z,t) = T_m\,\mathrm{erf}\!\left(\frac{z}{2\sqrt{\kappa t}}\right) \tag{4}$$

gives the temperature at a depth z for material of age t. The 
lithosphere moves away from the ridge at half the total 
spreading rate, so the age of the lithosphere is $t = x/v$, its distance from 
the ridge divided by the half-spreading rate v. Thus the 
temperature (Eqn 4) as a function of distance and depth is 

$$T(x,z) = T_m\,\mathrm{erf}\!\left(\frac{z}{2\sqrt{\kappa x/v}}\right). \tag{5}$$
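
The halfspace cooling solution is straightforward to evaluate numerically. The sketch below is illustrative only: the mantle temperature of 1350 °C is a value chosen within the 1300-1400 °C range quoted above, and the thermal diffusivity of 1e-6 m²/s is an assumed round number, not a value given in the text.

```python
import math

SECONDS_PER_MYR = 1.0e6 * 365.25 * 24 * 3600

def halfspace_temperature(depth_km, age_myr, t_mantle=1350.0,
                          t_surface=0.0, kappa=1.0e-6):
    """Temperature (deg C) from T = Ts + (Tm - Ts) erf(z / (2 sqrt(kappa t))).

    kappa is the thermal diffusivity in m^2/s (assumed value).
    """
    if age_myr <= 0.0:
        # At zero age the whole column is at mantle temperature,
        # except the surface itself.
        return t_mantle if depth_km > 0.0 else t_surface
    s = depth_km * 1.0e3 / (2.0 * math.sqrt(kappa * age_myr * SECONDS_PER_MYR))
    return t_surface + (t_mantle - t_surface) * math.erf(s)

def temperature_at_distance(depth_km, distance_km, half_rate_mm_yr, **kwargs):
    """Same solution with age t = x / v; km divided by mm/yr gives Myr."""
    return halfspace_temperature(depth_km, distance_km / half_rate_mm_yr, **kwargs)

depths_km = (0, 10, 25, 50, 100)
for age in (1, 10, 50, 100):
    profile = [round(halfspace_temperature(z, age)) for z in depths_km]
    print(f"age {age:3d} Myr: T at {depths_km} km = {profile}")
```

Each printed profile rises from the surface temperature toward the mantle temperature, and the depth at which it approaches the mantle value increases with age, which is the deepening of cooling described above.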


It is useful to think of isotherms, lines of constant temperature, 
in the plate. An isotherm is a curve on which the argument of 
the error function is constant, 

$$\frac{z}{2\sqrt{\kappa t}} = c, \qquad z = 2c\sqrt{\kappa t}, \tag{6}$$



so that the depth to a given temperature increases as the square 
root of the lithospheric age. 
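
A short sketch can make the square-root deepening of an isotherm concrete. It inverts the error function by bisection, because Python's standard library provides `math.erf` but no inverse. The mantle temperature (1350 °C) and diffusivity (1e-6 m²/s) are assumed illustrative values, and the function names are hypothetical.

```python
import math

SECONDS_PER_MYR = 1.0e6 * 365.25 * 24 * 3600

def erf_inverse(y, tol=1.0e-12):
    """Solve erf(c) = y for 0 <= y < 1 by bisection."""
    lo, hi = 0.0, 6.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def isotherm_depth_km(t_iso, age_myr, t_mantle=1350.0, kappa=1.0e-6):
    """Depth z = 2 c sqrt(kappa t) of the isotherm t_iso, with erf(c) = t_iso / t_mantle."""
    c = erf_inverse(t_iso / t_mantle)
    return 2.0 * c * math.sqrt(kappa * age_myr * SECONDS_PER_MYR) / 1.0e3

for age in (1, 4, 16, 64):
    print(f"age {age:2d} Myr: 1000 C isotherm at {isotherm_depth_km(1000.0, age):.1f} km")
```

Each fourfold increase in age doubles the printed depth, as expected for a quantity proportional to the square root of age.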

This is an example of a general feature of heat conduction 
problems: setting c = 1 and examining Fig. 5.3-5 for erf (1) 

Fig. 5.3-7 Models and data for thermal evolution of the oceanic 
lithosphere. Left: Isotherms for thermal models. The 
lithosphere continues cooling for all ages in a halfspace model, but 
equilibrates for ~70 Ma lithosphere in a plate model with 
a 95 km-thick thermal lithosphere. The plate model shown has a higher 
basal temperature than the halfspace model. 
Right: Comparison of thermal model predictions to different data. 
All show a lithospheric cooling signal, and are better 
(but far from perfectly) fit by the predictions 
of a plate model (solid lines) than by 
those of a halfspace model (dashed lines). 
(Richardson et al., 1995. Geophys. Res. 
Lett., 22, 1913-16, copyright by the 
American Geophysical Union.) 


with the square root of age. Approximating the gradient at the 
surface by the average gradient through the lithosphere, 

(16) 


predicts that the heat flow decreases as the inverse square root of age. 
The same result can be obtained by differentiation of the 
temperature structure (Eqn 4) using 

$$\frac{d}{dz}\,\mathrm{erf}(s) = \frac{ds}{dz}\,\frac{d}{ds}\,\mathrm{erf}(s) = \frac{2}{\sqrt{\pi}}\,e^{-s^{2}}\,\frac{ds}{dz}$$

(18) 


This model, which predicts that lithospheric thickness and 
ocean depth increase as the square root of age, and that heat flow 
decreases as its inverse, for all ages, 
is called a halfspace model (Fig. 5.3-7, upper left). In it, the 
lithosphere is the upper layer of a halfspace that continues 
cooling for all time. (In reality, oceanic lithosphere never gets 
older than about 200 million years because it gets subducted.) 
The model does a good job of describing the average variation 
in ocean depth and heat flow with lithospheric age. 
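
The halfspace heat flow can be sketched numerically by differentiating Eqn 4 at the surface, which gives a gradient of $T_m/\sqrt{\pi \kappa t}$ and hence a heat flow $q = k T_m/\sqrt{\pi \kappa t}$. The material constants below (conductivity, density, specific heat) and the mantle temperature are assumed illustrative values, not numbers taken from the text.

```python
import math

SECONDS_PER_MYR = 1.0e6 * 365.25 * 24 * 3600

# Assumed illustrative material properties.
CONDUCTIVITY = 3.138      # k, W m^-1 K^-1
DENSITY = 3330.0          # rho, kg m^-3
SPECIFIC_HEAT = 1171.0    # C_p, J kg^-1 K^-1
T_MANTLE = 1350.0         # deg C, within the 1300-1400 range

# Thermal diffusivity from its definition, kappa = k / (rho * C_p).
KAPPA = CONDUCTIVITY / (DENSITY * SPECIFIC_HEAT)

def halfspace_heat_flow(age_myr):
    """Surface heat flow in mW/m^2: q = k Tm / sqrt(pi kappa t)."""
    t = age_myr * SECONDS_PER_MYR
    return 1.0e3 * CONDUCTIVITY * T_MANTLE / math.sqrt(math.pi * KAPPA * t)

print(f"kappa = {KAPPA:.2e} m^2/s")
for age in (1, 4, 25, 100):
    print(f"age {age:3d} Myr: q = {halfspace_heat_flow(age):.0f} mW/m^2")
```

Quadrupling the age halves the predicted heat flow, the inverse square root dependence of the halfspace model.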

However, because ocean depth seems to “flatten” at about 
70 Myr, we often use a modification called a plate model 
(Fig. 5.3-7, lower left), which assumes that the lithosphere 
evolves toward a finite plate thickness L with a fixed basal tem¬ 
perature T m . In this model, 


$$T(x, z) = T_m \left[ \frac{z}{L} + \sum_{n=1}^{\infty} c_n \exp(-\beta_n x/L) \sin(n\pi z/L) \right], \qquad (19)$$

where $c_n = 2/(n\pi)$, $\beta_n = (R^2 + n^2\pi^2)^{1/2} - R$, and $R = vL/(2\kappa)$. The 
constant $R$, known as the thermal Reynolds number, relates the 
rates at which heat is transported horizontally by plate motion 
and conducted vertically. In this model isotherms initially 
deepen as the square root of age, but eventually level out. The 
flattening reflects the fact that heat is being added from below, 
which the model approximates by having old lithosphere reach 
a steady-state thermal structure that is simply a linear geotherm 
(Fig. 5.3-8, top). As a result, the predicted sea floor depth and 
heat flow behave at young ages as in the halfspace 
model, but evolve asymptotically toward constant values at 
old ages. Both asymptotic values have simple interpretations: the heat 
flow is proportional to the geothermal gradient, and thus to $T_m/L$, whereas 
the depth is proportional to the thermal subsidence, and hence to the heat lost 
since the plate formed at the ridge, and thus to the product $T_m L$. 
The model parameters can be estimated by an inverse problem, 
finding those that best fit a set of depth and heat flow data 
versus age (Fig. 5.3-8, bottom). 
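
The leveling out of the isotherms can be seen by evaluating the plate-model series directly. The sketch below assumes the series has the form T = T_m [z/L + Σ c_n exp(-β_n x/L) sin(nπz/L)], with c_n, β_n, and R as defined above and R built from the thermal diffusivity κ. The default plate thickness and basal temperature are the GDH1 values quoted in Fig. 5.3-8; the diffusivity, the number of terms, the spreading rate in the usage note, and the function name are assumptions made for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def plate_temperature(x, z, v, L=95e3, t_m=1450.0, kappa=1e-6, n_terms=200):
    """Plate-model temperature (deg C) at distance x (m) from the ridge and depth z (m).

    v is the plate speed in m/s. kappa and n_terms are assumed values.
    """
    R = v * L / (2.0 * kappa)              # thermal Reynolds number
    total = z / L                          # steady-state linear geotherm
    for n in range(1, n_terms + 1):
        n_pi = n * math.pi
        c_n = 2.0 / n_pi
        # Algebraically equal to sqrt(R^2 + n^2 pi^2) - R, written so that
        # two nearly equal numbers are not subtracted when R is large.
        beta_n = n_pi ** 2 / (math.sqrt(R * R + n_pi ** 2) + R)
        total += c_n * math.exp(-beta_n * x / L) * math.sin(n_pi * z / L)
    return t_m * total
```

For example, with an assumed half-spreading rate of 50 mm/yr, `plate_temperature(0.05 * 200e6, 47.5e3, 0.05 / SECONDS_PER_YEAR)` evaluates the mid-plate temperature of 200 Myr-old lithosphere, which is close to T_m/2 because the exponential terms have decayed and only the linear geotherm remains.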

Comparison with data shows that the plate thermal model 
is a good, but not perfect, fit to the average data because pro¬ 
cesses other than this simple cooling are also occurring. For 
example, ocean depth is also affected by uplift associated with 
hot spots (Section 5.2.4). Water flow in the crust transports 
some of the heat for ages less than about 50 Ma, making the 
observed heat flow lower than the model’s predictions, which 
assume that all heat is transferred by conduction. Some topo¬ 
graphic effects, including the spectacular volcanic oceanic 
plateaus, result from crustal thickness variations. Because these 
and other effects vary from place to place, the data vary about 
their average values for a given age. 
Fig. 5.3-8 Top: Asymptotic thermal structure for old lithosphere in 
a plate model. The sea floor subsidence from the ridge, and thus ocean 
depth, is proportional to the shaded area between the geotherm and 
T = T m , whereas heat flow is proportional to the geotherm. A schematic 
adiabatic temperature gradient (Section 5.4.1) is shown beneath the plate. 
(Stein and Stein, 1992. Reproduced with permission from Nature .) 

Bottom: Fitting process used for thermal model parameters. The misfit to a 
set of depth and heat flow data has a minimum at the point labeled GDH1, 
a plate thermal thickness of 95 ± 15 km and basal temperature of 1450 ± 
250°C. (Stein and Stein, 1996. Subduction , 1-17, copyright by the 
American Geophysical Union.) 

We can view ocean depth, heat flow, and several other 
properties of the oceanic lithosphere as observable measures 
of the temperature in the cooling lithosphere. Because the 
observables depend on different combinations of parameters 
(Table 5.3-1), they can be used together to constrain individual 
parameters (a halfspace model corresponds to an infinitely 
thick plate). The depth depends on the integral of the temper¬ 
ature (Eqn 11), whereas the heat flow depends on its derivative 
at the sea floor (Eqn 15). Similarly, the slope of the geoid, a 
function of the gravity field depending on a weighted integral of 
the density, also varies with age in general agreement with the 
plate model’s prediction (Fig. 5.3-7). 

In addition, the elastic thickness of the lithosphere in¬ 
ferred from the deflection caused by loads such as seamounts 
(Fig. 5.3-9a), the maximum depth of intraplate earthquakes 
within the oceanic lithosphere (Fig. 5.3-9b), and the depth to 


Table 5.3-1 Constraints on thermal models T(z, t). 

Observable            Proportional to                                     Reflects 
Young ocean depth     $\int T(z, t)\, dz$                                 $\kappa^{1/2} \alpha T_m$ 
Old ocean depth       $\int T(z, t)\, dz$                                 $\alpha T_m L$ 
Old ocean heat flow   $\partial T(z, t)/\partial z |_{z=0}$               $k T_m / L$ 
Geoid slope           $\frac{\partial}{\partial t} \int z\, T(z, t)\, dz$   $\kappa \alpha T_m \exp(-\kappa t / L^2)$ 

Source: Stein and Stein (1996). 
Fig. 5.3-9 Comparison of isotherms as functions of age for a plate model 
to three datasets whose variation with age is consistent with cooling of the 
lithosphere. The effective elastic thickness (a), deepest intraplate seismicity 
(b), and depth to the low-velocity zone, shown by velocity profiles at 
different ages (c), all increase with age. (After Stein and Stein, 1992. 
Reproduced with permission from Nature.) 
the low-velocity zone determined from surface wave dispersion 
(Figs. 5.3-9c and 2.8-7), all increase with age. Hence the 
cooling of oceanic lithosphere causes the expected increase 
in strength and seismic velocity. Moreover, as discussed in 
Section 5.5, the resulting density increase is thought to provide 
a major force driving plate motions. 


Because various properties vary with age, the oceanic lithosphere 
can be defined in various ways, so terms like “seismic 
lithosphere,” “elastic lithosphere,” and “thermal lithosphere” 
are often used. Interestingly, these thicknesses differ. The 
deepest earthquakes appear to be bounded by about the 600-800 °C 
isotherms, such that hotter material cannot support seismic failure. The 




Fig. 5.3-10 Top : Geological interpretation 
of a multichannel seismic velocity study on 
the East Pacific rise. A low-velocity region 
under the axis is interpreted as a hot region 
of melting, capped by a magma lens. Dashed 
lines are possible paths of water circulation. 
(Vera et al., 1990. J. Geophys. Res., 95, 
15,529-56, copyright by the American 
Geophysical Union.) Bottom: Schematic 
cross-section across the East Pacific rise. The 
broad region of low velocities is interpreted 
as the primary melting region. Small ellipses 
are directions of preferred olivine alignment 
inferred from anisotropy. Lines with arrows 
indicate inferred mantle flow, causing the 
distortion shown of an initially vertical line. 
Absolute velocities of the two plates (Pacific 
on left, Nazca on right) are given by small 
horizontal arrows. (Forsyth et al., 1998. 
Science, 280, 1215-18, copyright 1998 
American Association for the Advancement 
of Science.) 
Fig. 5.3-11 Thermal and petrological model for the difference between fast-spreading (left) and slow-spreading (right) ridges. (Sleep and Rosendahl, 
1979. J. Geophys. Res., 84, 6831-9, copyright by the American Geophysical Union.) 


elastic thickness corresponds approximately to the 400 °C 
isotherm, whereas the low-velocity zone begins approximately 
below the 1000 °C isotherm (Fig. 5.3-9c). These differences, 
discussed in Section 5.7, likely result from rock being stronger 
for more rapid deformation. All of these thicknesses, however, 
only approximate what we would like to know but cannot 
directly measure: the depth of the base of the moving plate, 
which is likely to be a gradational rather than a distinct 
boundary. 

5.3.3 Ridge and transform earthquakes and processes 

Seismology makes important contributions to understanding 
the properties and behavior of spreading centers. Ocean 
bottom seismometers yield locations of microearthquakes and 
data for travel time and waveform studies. Larger earthquakes 
are also studied using teleseismic body and surface waves. The 
seismological results are being integrated with marine geo¬ 
physical and petrological data to develop better models. For 
example, Fig. 5.3-10 (top) shows a geological interpretation of 
a multichannel seismic study (Section 3.3) that used air gun and 
explosive sources to image velocity structure under the East 
Pacific rise to a depth of about 10 km. A low-velocity region 
under the axis is interpreted as a hot melting region capped by a 
magma lens. Other studies using ocean bottom seismometers 
and distant earthquake sources map the structure to greater 
depth, including inferring flow directions under the ridge axis 
using anisotropy (Fig. 5.3-10, bottom). Such studies are find¬ 
ing interesting features of the spreading process. For example, 
the broad region of low velocity presumed to be the primary 
melting area extends further west than east of the axis. This 
asymmetry may occur because the westward absolute motion 
of the Pacific plate is much faster than the eastward absolute 
motion of the Nazca plate, causing the ridge to migrate west¬ 
ward relative to the deep mantle. Thus the spreading process, 
which depends on the relative plate motion (spreading rate), 
also seems affected by the absolute motion. 

Some effects of the spreading rate are illustrated by a model 


shown in Fig. 5.3-11. At a given distance from the ridge, faster 
spreading produces younger lithosphere and isotherms closer 
to the surface than does slow spreading. If the region beneath 
the 1185 °C isotherm and above the Moho depth of 5 km is 
considered to be a magma chamber, a fast ridge has a larger 
magma chamber. Hence crust moving away from a fast¬ 
spreading ridge is more easily replaced than that moving away 
from a slow ridge. Thus, in contrast to the axial valley and 
normal faulting earthquakes on a slow ridge, a fast ridge has an 
axial high and an absence of earthquakes. Similarly, both the 
depths and the maximum seismic moments 5 of ridge crest 
normal faulting earthquakes decrease with spreading rate 
(Fig. 5.3-12). These observations are consistent with the fault 
area decreasing on faster-spreading and hotter ridges, because 
faulting requires that rock be below a limiting temperature, 
above which it flows (Section 5.7). The idea that the faulting 
depends on temperature is also implied by the increase in the 
maximum depth of oceanic intraplate earthquakes with age 
(Fig. 5.3-9b). 

Transform fault earthquakes also depend on thermal struc¬ 
ture. The temperatures along a transform fault should be essen¬ 
tially the average of the expected temperature on the two sides; 
coolest at the transform midpoint and hottest at either end 
(Fig. 5.3-13). As expected from the area available for fault¬ 
ing, the maximum seismic moment for transform earthquakes 
decreases with spreading rate (Fig. 5.3-14), consistent with the 
idea of faulting limited to a zone bounded by the isotherms. 

An interesting question is how the seismic moments of trans¬ 
form earthquakes relate to the plate motion. The average slip 
rate from earthquakes can be inferred from the total seismic 
moment released on a transform, assuming that 

$$\text{seismic slip rate} = \frac{\text{total seismic moment}}{(\text{fault area})(\text{rigidity})(\text{time period})}. \qquad (20)$$

5 Recall (Section 4.6) that the seismic moment is the product of the rigidity, the slip 
in the earthquake, and the fault area. 
Fig. 5.3-13 Thermal model of the Romanche Transform. 

Top: Temperatures on either flank predicted by the cooling halfspace 
model. Bottom: Average temperature distribution along the transform. 
(After Engeln et al., 1986. J. Geophys. Res., 91, 548-77, copyright by 
the American Geophysical Union.) 
Fig. 5.3-14 Seismic moment versus spreading rate for oceanic transforms. 
The maximum moment decreases with spreading rate, as expected from 
thermal considerations. (After Solomon and Burr, 1979. Tectonophysics, 
55, 107-26, with permission of Elsevier Science.) 
Fig. 5.3-15 The Easter microplate on the East Pacific rise. Top : Seismicity 
(dots) and focal mechanisms in the microplate region. Note the normal 
faulting on the southern boundary. (After Engeln and Stein, 1984.) 
Bottom : Schematic model for the evolution of a rigid microplate between 
two major plates by rift propagation. Successive isochrons illustrate the 
northward propagation of the east ridge, slowing of spreading on the west 
ridge, the rotation of the microplate, the reorientation of the two ridges, 
and the conversion of the initial transform into a slow and obliquely 
spreading ridge. (Engeln et al., 1988. J. Geophys. Res., 93, 2839-56, 
copyright by the American Geophysical Union.) 


Using this relation requires inferring the fault area, which 
depends on both the transform length and the depth to which 
faulting occurs. Assuming the area above the 600-700 °C 
isotherms fails seismically, the seismic slip rate for major 
Atlantic transforms is generally less than predicted by the plate 
motion. Thus, if the time period sampled is long enough to 
be representative — a major question — some of the plate 
motion occurs aseismically. The issue of how much slip occurs 
seismically remains unresolved, as we will see when we discuss 
subduction zones (Section 5.4.3) and intraplate deformation 
zones (Section 5.6.2). 
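
The comparison between seismic slip rate and plate motion described above can be sketched in a few lines. The function below simply evaluates Eqn 20, taking the fault area as transform length times the depth of seismic faulting. The function name and all numbers in the example (a 300 km long transform, faulting to 10 km depth, a total moment of 10^20 N m released in 50 years, a rigidity of 3 x 10^10 Pa, and a plate rate of 30 mm/yr) are hypothetical values chosen only to illustrate the arithmetic.

```python
def seismic_slip_rate(total_moment_nm, length_m, depth_m, years, rigidity_pa=3e10):
    """Average seismic slip rate (m/yr) from Eqn 20.

    The fault area is approximated as transform length times the depth
    to which faulting occurs. The default rigidity is an assumed value.
    """
    fault_area = length_m * depth_m
    return total_moment_nm / (fault_area * rigidity_pa * years)

# Hypothetical transform, for illustration only.
rate = seismic_slip_rate(1e20, 300e3, 10e3, 50.0)
plate_rate = 0.030  # m/yr, assumed plate motion across the transform
print(f"seismic slip rate: {1e3 * rate:.1f} mm/yr")
print(f"fraction of plate motion released seismically: {rate / plate_rate:.2f}")
```

A fraction below one would suggest that part of the motion is aseismic, subject to the caveat in the text that the time period sampled must be long enough to be representative. The result is also sensitive to the assumed faulting depth, which enters through the fault area.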

In addition, seismology helps study how ridge-transform 
systems evolve. For example, the East Pacific rise near Easter 
Island contains two approximately parallel sections (Fig. 5.3-15, 
top). Earthquakes occur on these ridges, but not between 
them, suggesting that the area in between is an essentially rigid 
microplate. The normal fault earthquakes on the microplate’s 
southern boundary are surprising because the East Pacific 
rise here is a very fast-spreading (15 cm/yr) ridge, which should 
not have normal fault earthquakes (Fig. 5.3-12). Magnetic 
anomalies show that the east ridge segment is propagating 
northward and taking over from the old (west) ridge segment. 
Figure 5.3-15 ( bottom ) shows a simplified model of this pro¬ 
cess. Because finite time is required for the new ridge to transfer 
spreading from the old ridge, both ridges are active at the 
same time, and the spreading rate on the new ridge is very slow 
at its northern tip and increases southward. As a result, the 
microplate rotates, causing compression (thrust faulting) and 
extension (normal faulting) at its north and south boundaries, 
respectively. Ultimately the old ridge will die, transferring 
lithosphere originally on the Nazca plate to the Pacific plate, 
and leaving inactive fossil ridges on the sea floor. Both V- 
shaped magnetic anomalies characteristic of ridge propagation 
and fossil ridges are widely found in the ocean basins, showing 
that this is a common way that ridges reorganize. Even for 
smaller (a few km) propagating ridge systems, studies of the 
associated earthquakes can yield useful information about the 
propagation process. 


5.4 Subduction zones 

We have seen that earthquakes at spreading centers, which at 
shallow depths are upwelling limbs of the mantle convection 
system, reflect the processes forming oceanic lithosphere there. 
In a similar way, earthquakes at subduction zones, downwelling 
limbs of the convection system, reflect the processes by 
which oceanic lithosphere reenters the mantle. Plate conver¬ 
gence takes different forms, depending on the plates involved. 
Figure 5.4-1 shows the basic model for a situation where 
oceanic lithosphere of one plate subducts beneath oceanic 
lithosphere of the overriding plate. Typically, a volcanic island 
arc forms, and sea floor spreading occurs behind the arc, 
forming a back-arc basin or marginal sea. Earthquakes occur 
both at the trench and to great depth, forming a dipping 



Fig. 5.4-1 Schematic diagram of processes 
associated with the subduction of one 
oceanic plate beneath another. 


Wadati-Benioff zone. By contrast, when oceanic lithosphere 
subducts beneath a continent, a mountain chain like the Andes 
forms on the continent, and the oceanic lithosphere forms 
a Wadati-Benioff zone. Finally, because continental crust 
cannot subduct, convergence between two continental plates, 
as in the Himalayas, causes crustal thickening, mountain build¬ 
ing, and shallow earthquakes but does not create a Wadati- 
Benioff zone. 

Subduction zones have a wide variety of earthquakes with 
different focal mechanisms and depths. There are shallow (less 
than 70 km deep), intermediate (70-300 km deep), and deep 
(more than 300 km deep) focus earthquakes. 1 These earth¬ 
quakes occur in different tectonic environments. The inter¬ 
mediate and deep earthquakes forming the Wadati-Benioff 
zone occur in the cold interiors of downgoing slabs. The shallow 
earthquakes are associated with the interaction between the 
two plates. The largest and most common of these shallow 
earthquakes occur at the interface between the plates, and 
release the plate motion that has been locked at the plate inter¬ 
face. In addition, shallow earthquakes can occur within both 
the overriding and the subducting plates. Figure 5.4-2 shows 
some features of seismicity observed in subduction zones. Not 
all features have been observed at all places. For example, the 
dips and shapes of subduction zones vary substantially. Some 
show double planes of intermediate or deep seismicity, whereas 
others do not. 

In discussing subduction zones, we follow an approach 
similar to that used in the last section for ridges. We introduce 
thermal models for subduction, then use them to gain insight 
into earthquake and seismic velocity observations. We will see 
that seismological observations, thermal models, and calcula¬ 
tions of the behavior of materials at high temperature and pres¬ 
sure are combined to investigate these complicated regions. In 
general, the seismological observations are fairly clear, but they 
can be interpreted in terms of a variety of models. As a result, 
subduction zone studies remain active, fruitful, and exciting. 


1 Slightly different definitions have been used for these depth ranges; for example, 
325 km has also been used as the upper limit for deep earthquakes. 
Fig. 5.4-2 Composite subduction zone showing some earthquake types. 
Not all are observed at all subduction zones. 


5.4.1 Thermal models of subduction 

The essence of subduction is the penetration and slow heating 
of a cold slab of lithosphere as it descends into the warmer 
mantle. As we will see, slabs subduct rapidly compared to the 
time needed for heat conducted from the surrounding mantle 
to warm them up. Thus they remain colder, denser, and me¬ 
chanically stronger than the surrounding mantle. Consequently, 
slabs transmit seismic waves faster and with less attenuation 
than the surrounding mantle, making it possible to map slabs 
and to show that deep earthquakes occur within them. More¬ 
over, the negative thermal buoyancy of cold slabs appears to be 
the primary force driving plate motions and provides a major 
source of stress within them that causes deep earthquakes. 

To explore the thermal evolution of slabs, we use two 
approaches. First, we discuss a simplified analytic thermal 
model that allows insights into the physics. We then discuss 
numerical models that incorporate additional effects in the 
hope of providing a more realistic description. We highlight 
some significant points, and more complete information can be 
found in the references. 
Fig. 5.4-3 An analytic model for 
temperatures in a subducting plate. 

Left: model geometry. Right: Results, 
showing the cold slab heating up as it 
descends through the hotter surrounding 
mantle. 




The analytic model (Fig. 5.4-3) considers a semi-infinite slab 
of thickness L subducting at rate v. The surrounding mantle is 
at temperature $T_m$, and the plate enters the trench with a linear 
temperature gradient from $T = 0$ at its top to $T_m$ at its base. We 
define the x axis down the dip of the slab, and the y axis across 
the slab. The evolution of the region is given by a slightly more 
complicated version of the heat equation (Eqn 5.3.1) used to 
model the cooling of the lithosphere as it moves away from the 
ridge. This version, 


$$\rho C_p \left( \frac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T \right) = \nabla \cdot (k \nabla T) + \varepsilon, \qquad (1)$$

describes the evolution of the temperature field, $T(x, y, t)$, as a 
function of time and the two space coordinates. In addition to 
the heat conduction term $\nabla \cdot (k \nabla T)$, Eqn 1 includes a $\mathbf{v} \cdot \nabla T$ term 
describing the transfer (or advection) of heat by movement of 
material, and the $\varepsilon$ term representing additional sources or 
sinks of heat such as radioactivity and phase changes. This 
form allows key parameters such as the density $\rho$, specific heat 
$C_p$, thermal conductivity $k$, and heat sources or sinks $\varepsilon$ to vary 
with position. For a simple analytic solution, we assume that 
the problem is steady state ($\partial T/\partial t = 0$) and neglect heat sources 
and sinks ($\varepsilon = 0$). We further assume that the physical properties 
of the material ($\rho$, $C_p$, $k$, and hence the thermal diffusivity 
$\kappa = k/(\rho C_p)$) are independent of position. 

With these simplifications, Eqn 1 becomes 

$$\rho C_p v \frac{\partial T}{\partial x} = k \left( \frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} \right), \qquad (2)$$

which has a series solution 

$$T(x, y) = T_m \left[ 1 + 2 \sum_{n=1}^{\infty} c_n \exp(-\beta_n x/L) \sin(n\pi y/L) \right], \qquad (3)$$

with 

$$c_n = (-1)^n/(n\pi), \quad \beta_n = (R^2 + n^2\pi^2)^{1/2} - R, \quad R = vL/(2\kappa).$$

$R$, the dimensionless thermal Reynolds number, is the ratio of 
the rate at which cold material is subducted to that at which it 
heats up by conduction. This solution resembles the temperature 
field in the plate model of cooling lithosphere (Eqn 5.3.19), 
because both models describe the thermal evolution of a plate 
of finite thickness with temperature boundary conditions at the 
top, bottom, and one end. In the previous case the plate cools, 
whereas in this case it heats up. 

To find how far along the slab a given isotherm penetrates, 
we approximate the series by its first term and use the fact that 
$R \gg \pi$, so 

$$T(x, y) \approx T_m [1 - (2/\pi) \exp(-\pi^2 x/(2RL)) \sin(\pi y/L)]. \qquad (4)$$

Solving for the point where $\partial T/\partial y = 0$ yields $y = L/2$, the middle 
of the slab. In fact, taking additional terms shows that this 
point is actually closer to the colder top (Fig. 5.4-3). Using the 
first-term approximation, a temperature $T_0$ goes furthest into 
the subduction zone at 

$$T_0(x_0, L/2) = T_m [1 - (2/\pi) \exp(-\pi^2 x_0/(2RL))], \qquad (5)$$

and reaches a maximum down-dip distance 

$$x_0 = -\frac{vL^2}{\pi^2 \kappa} \ln\left[ \frac{\pi (T_m - T_0)}{2 T_m} \right]. \qquad (6)$$

To convert this distance to depth in the mantle, we multiply by 
$\sin\delta$, where $\delta$ is the slab dip. This correction converts the 
subduction rate $v$ to the slab’s vertical descent rate $v \sin\delta$. 
Thus an isotherm’s maximum depth is proportional to the 
subduction rate and the square of the plate thickness, so faster 
subduction or a thicker slab allows material to go deeper before 
heating up. If we assume that the square of the plate thickness is 
proportional to its age, the maximum depth to an isotherm in 
the downgoing slab is proportional to the vertical descent rate 
times the age, $t$, of the subducting lithosphere. 
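
The scaling of isotherm depth with subduction rate and plate thickness can be illustrated with a short calculation. The sketch below evaluates the first-term result for the maximum down-dip distance, read here as x_0 = -(vL^2/π^2κ) ln[π(T_m - T_0)/(2T_m)], and multiplies by the sine of the dip. The function name and the default values of L, T_m, and κ are assumptions for illustration. Because only one term of the series is kept, the expression is meaningful only for isotherms hotter than about (1 - 2/π)T_m, and the function rejects colder ones.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def isotherm_max_depth(t_iso, v_m_per_yr, dip_deg, L=95e3, t_m=1450.0, kappa=1e-6):
    """Maximum depth (m) reached by the isotherm t_iso (deg C) in a subducting slab.

    Uses the first-term approximation of the series solution.
    L, t_m, and kappa defaults are assumed values.
    """
    ratio = math.pi * (t_m - t_iso) / (2.0 * t_m)
    if not 0.0 < ratio < 1.0:
        raise ValueError("first-term approximation needs (1 - 2/pi)*t_m < t_iso < t_m")
    v = v_m_per_yr / SECONDS_PER_YEAR                      # convert to m/s
    x0 = -(v * L ** 2 / (math.pi ** 2 * kappa)) * math.log(ratio)
    return x0 * math.sin(math.radians(dip_deg))

depth = isotherm_max_depth(700.0, 0.08, 45.0)
print(f"700 C isotherm reaches about {depth / 1e3:.0f} km depth")
```

Doubling the subduction rate doubles the predicted depth, and doubling the slab thickness quadruples it, as stated in the text.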

This idea can be tested by assuming, as we did for spreading 
center earthquakes, that the maximum depth of earthquakes 
is temperature-controlled, so earthquakes should cease once 
material reaches a temperature that is too high. To compare 
Fig. 5.4-4 Maximum earthquake depths for different subduction zones 
as a function of thermal parameter, the product of vertical descent rate 
and lithospheric age. If earthquakes are limited by temperature, this 
observation is consistent with the simple thermal model’s prediction 
that the maximum depth to an isotherm should vary with the thermal 
parameter. (After Kirby et al., 1996b. Rev. Geophys., 34, 261-306, 
copyright by the American Geophysical Union.) 


various subduction zones, we examine the maximum depth of 
earthquakes as a function of their thermal parameter 

$$\phi = t v \sin\delta. \qquad (7)$$

Figure 5.4-4 shows that the maximum depth increases with 
thermal parameter, and deep earthquakes below 300 km occur 
only for slabs with a thermal parameter greater than about 
5000 km. 
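
As a quick illustration, the thermal parameter is easy to compute once age, convergence rate, and dip are chosen. The sketch below uses the ages and rates given for the two slabs in Fig. 5.4-6; the dips of 45° and 60° are assumptions, picked only because they reproduce the thermal parameters quoted there (about 2500 km and 17,000 km). The function name and the treatment of the 5000 km threshold as a sharp cutoff are likewise illustrative.

```python
import math

def thermal_parameter_km(age_myr, rate_mm_per_yr, dip_deg):
    """Thermal parameter (km): lithospheric age times vertical descent rate.

    A rate of 1 mm/yr sustained for 1 Myr is exactly 1 km, so no unit
    conversion factor is needed.
    """
    return age_myr * rate_mm_per_yr * math.sin(math.radians(dip_deg))

DEEP_EARTHQUAKE_THRESHOLD_KM = 5000.0  # approximate value quoted in the text

# Ages and rates from Fig. 5.4-6; dips are assumed.
for name, age, rate, dip in (("younger, slower slab", 50, 70, 45.0),
                             ("older, faster slab", 140, 140, 60.0)):
    phi = thermal_parameter_km(age, rate, dip)
    deep = phi > DEEP_EARTHQUAKE_THRESHOLD_KM
    print(f"{name}: thermal parameter {phi:.0f} km, deep earthquakes expected: {deep}")
```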

However, the fact that the earthquakes stop does not mean 
that the slab has equilibrated with the surrounding mantle. 
Figure 5.4-5 shows the predicted minimum temperature within 
a slab as a function of time since subduction, assuming it 
maintains its simple planar geometry and does not buckle or 
thicken. The coldest portion reaches only about half the mantle 
temperature in about 10 Myr, which is about the time required 
for the slab to reach 660 km. Thus the restriction of seismicity 
to depths shallower than 660 km does not indicate that the slab 
is no longer a discrete thermal and mechanical entity. From a 
thermal standpoint, there is no reason for slabs not to penetrate 
into the lower mantle, an issue we discuss shortly. If a slab 
descended through the lower mantle at the same rate (in fact, 
it would probably slow down due to the more viscous lower 
mantle), it would retain a significant thermal anomaly at the 
core-mantle boundary, consistent with some models of that 
region (Section 3.8.4). 2 
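
The slow warming of the slab interior follows directly from the first-term approximation. Evaluated at the slab center, y = L/2, after a time t = x/v since subduction, it gives T_min/T_m ≈ 1 - (2/π) exp(-π^2 κt/L^2), in which the subduction rate cancels. The sketch below evaluates this expression; the slab thickness of 95 km and the diffusivity of 10^-6 m^2/s are assumed values, and the function name is hypothetical.

```python
import math

SECONDS_PER_MYR = 1e6 * 365.25 * 24 * 3600

def min_slab_temperature_fraction(time_myr, L=95e3, kappa=1e-6):
    """Coldest slab temperature as a fraction of the mantle temperature.

    First-term approximation evaluated at the slab center (y = L/2).
    It is not accurate at very early times, where more terms are needed.
    L and kappa are assumed values.
    """
    t = time_myr * SECONDS_PER_MYR
    return 1.0 - (2.0 / math.pi) * math.exp(-math.pi ** 2 * kappa * t / L ** 2)

for time_myr in (10, 40):
    frac = min_slab_temperature_fraction(time_myr)
    print(f"{time_myr} Myr after subduction: T_min/T_m = {frac:.2f}")
```

With these assumed values the fraction is roughly 0.55 after 10 Myr and roughly 0.84 after 40 Myr, in line with the "about half" and "80%" figures associated with Fig. 5.4-5.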

The thermal model can be improved with simple modific¬ 
ations. Although we assumed that the slab subducts into an 
isothermal mantle, temperature should increase with depth, 

2 The oceanic lithosphere takes about 70 Myr to cool to equilibrium with the mantle 
below, and so takes about half that time to heat up again from both sides after it 
subducts. 



Fig. 5.4-5 Minimum temperature within a slab as a fraction of the mantle 
temperature, as a function of the time since subduction, computed using 
the analytic thermal model (Fig. 5.4-3). The coldest portion reaches half 
the mantle temperature in about 10 Myr, by which time a typical slab is 
approximately at 670 km depth, and 80% of it in 40 Myr, by which time 
a slab that continued descending at the same rate would reach the core¬ 
mantle boundary. Slabs can thus remain thermally distinct for long 
periods of time. (Stein and Stein, 1996. Subduction , 1-17, copyright 
by the American Geophysical Union.) 


as the material is compressed due to increasing pressure from 
the overlying rock. Because the mantle below the lithosphere 
is thought to be convecting, it is often assumed that self- 
compression occurs adiabatically, such that material moving 
vertically neither loses nor gains heat. In this case, equilibrium 
thermodynamics requires that the effects of temperature and 
pressure changes exactly offset each other, 

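$$dS = \frac{C_p}{T}\, dT - \frac{\alpha}{\rho}\, dP = 0, \qquad (8)$$

so that the entropy $S$ does not change. This condition gives the 
adiabatic temperature gradient, or adiabat, as 

$$\left( \frac{\partial T}{\partial P} \right)_S = \frac{\alpha}{\rho C_p}\, T, \qquad (9)$$

where $\alpha$ is the coefficient of thermal expansion. Because pressure 
increases with depth as $dP/dz = \rho g$, temperature increases 
with depth as 

$$\left( \frac{\partial T}{\partial z} \right)_S = \frac{\alpha g}{C_p}\, T. \qquad (10)$$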


We can thus correct the temperatures for the isothermal mantle 
case to include adiabatic heating. Using the entropies requires 
using absolute (Kelvin) temperatures, equal to the Celsius 
temperature plus 273.15°. Thus if the absolute temperature at 
depth $z_0$, the base of the plate, is $T_0^K$, we integrate Eqn 10 to find 
the absolute temperature at depth $z$, 

$$T^K(z) = T_0^K \exp[(\alpha g/C_p)(z - z_0)]. \qquad (11)$$
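
The adiabatic correction is simple to evaluate. The sketch below integrates the adiabatic gradient in closed form, T^K(z) = T_0^K exp[(αg/C_p)(z - z_0)], converting to and from kelvin as the text requires. The function name and the values of α, g, and C_p are assumptions chosen as plausible upper-mantle magnitudes, and are treated as constant with depth for simplicity.

```python
import math

KELVIN_OFFSET = 273.15

def adiabatic_temperature_c(z_m, z0_m, t0_c, alpha=3e-5, g=9.8, c_p=1200.0):
    """Temperature (deg C) at depth z_m on the adiabat through (z0_m, t0_c).

    alpha (1/K), g (m/s^2), and c_p (J/kg/K) are assumed constant.
    """
    t0_k = t0_c + KELVIN_OFFSET                      # entropy relations need kelvin
    t_k = t0_k * math.exp(alpha * g / c_p * (z_m - z0_m))
    return t_k - KELVIN_OFFSET

# Example: adiabat from the base of a 95 km plate at 1450 deg C down to 660 km.
print(f"{adiabatic_temperature_c(660e3, 95e3, 1450.0):.0f} deg C at 660 km")
```

With these assumed values the gradient is a few tenths of a degree per kilometer, so the adiabatic increase across the upper mantle amounts to a few hundred degrees.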

Another possibly important effect is that of heat sources and 
sinks. For example, the olivine to spinel transition, which gives 
rise to the 410 km discontinuity outside the slab, should release 
heat as it occurs in the slab. Heat might also be generated by 
friction at the top of the downgoing slab. The heat produced 
is the product of the subduction rate and the shear stress on 
the slab interface. The magnitude of this effect is difficult to 
estimate. It should not be significant unless the shear stress is 
greater than a few kilobars. As discussed later (Section 5.7.5), 
the stress on faults is unknown. A further complexity results 
from the fact that the viscosity of the mantle, which controls 
the stress, decreases exponentially with temperature. Thus, if 
frictional heating raises the temperature at the slab interface, 
viscosity, and hence stress, would decrease, tending to counter¬ 
act the effect. 
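
An order-of-magnitude estimate shows why frictional heating matters only for large shear stresses. The sketch below multiplies an assumed shear stress by an assumed subduction rate to get the heat generated per unit area of the slab interface. The function name, the 80 mm/yr rate, and the choice of 1 kbar (100 MPa) as the test stress are illustrative assumptions, and the feedback through temperature-dependent viscosity described above is ignored.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def frictional_heating(shear_stress_pa, rate_mm_per_yr):
    """Heat generated per unit area of the slab interface (W/m^2).

    Product of shear stress and subduction rate; no viscosity feedback.
    """
    v = rate_mm_per_yr * 1e-3 / SECONDS_PER_YEAR     # convert to m/s
    return shear_stress_pa * v

q = frictional_heating(1e8, 80.0)                    # 1 kbar = 100 MPa
print(f"frictional heating: {1e3 * q:.0f} mW/m^2")
```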

To address these complexities, we use numerical models to 
solve the heat equation at every point in the slab. These models 
allow parameters such as density to vary with position. In addi¬ 
tion, heat sources and sinks such as radioactive heating, phase 
changes, and frictional heating can be incorporated. The 
results of such calculations are similar to those of the analytic 
model and are used to explore how temperatures should 
vary between subduction zones. For example, Fig. 5.4-6 com¬ 
pares models for a relatively younger and slower-subducting 
slab (thermal parameter about 2500 km), approximating the 
Aleutian arc, and an older, faster-subducting slab (thermal 
parameter approximately 17,000 km), approximating the 
Tonga arc. As expected, the slab with the higher thermal 
parameter warms up more slowly, and is thus colder. This 
prediction is consistent with the observation that Tonga has 
deep earthquakes, whereas the Aleutians do not (Fig. 5.4-4). 

Although we can compute such thermal models, a question is 
whether they make sense. We test them using two seismological 
datasets: earthquake locations and seismic velocities. Travel 
time tomography (Section 7.3) across subduction zones shows 
high-velocity slabs (Fig. 5.4-7). These results are compared to 
the velocities predicted using a thermal model of the sub¬ 
ducting slab and laboratory values for the variation in velocity 
with temperature. The model predicts coldest temperatures in 
the slab interior where the earthquakes occur. Because the 
tomographic inversion finds the velocity within rectangular 
cells, the model is converted to that grid and then “blurred” 
because the seismic rays do not uniformly sample the slab. As 
shown by the hit count, the number of rays sampling each cell, 
most rays go down the high-velocity slab, yielding a somewhat 
distorted image. The fact that this image and the tomographic 
result are similar suggests that the model is a reasonable de¬ 
scription of the actual slab. A similar conclusion emerges from 
the observation that the tomographic result also resembles 
parts of the model image that are artifacts, velocity anomalies 




Fig. 5.4-6 Comparison of thermal structure for a relatively younger, 
slower-subducting slab (50 Myr-old lithosphere subducting at 70 mm/yr; 
thermal parameter about 2500 km), which approximates the Aleutian arc, 
and an older, faster-subducting slab (140 Myr-old lithosphere subducting 
at 140 mm/yr; thermal parameter about 17,000 km) which approximates 
the Tonga arc. (Stein and Stein, 1996. Subduction , 1-17, copyright by the 
American Geophysical Union.) 


that are not present in the original model. These artifacts, gen¬ 
erally of low amplitude, cause the slab to appear to broaden, 
shallow in dip, or flatten out. Hence, although slab thermal 
models are simplifications of complicated real slabs, and many 
key parameters are not well known, it seems likely that the 
models are reasonable approximations (perhaps accurate to a 
few hundred degrees) to the temperatures within actual slabs. 

Seismology provides other tools to study the contrast 
between the cold, rigid, downgoing plate and the hotter, less 
rigid material around it. Figure 3.7-20 showed that a cold slab 
transmits seismic energy with less attenuation than its sur¬ 
roundings. Figure 5.4-8 shows some of the earliest data for this 
effect: seismograms from a deep earthquake are contrasted at 
stations NIU, to which waves travel through the downgoing 
plate; and VUN, to which waves arrive through the surround¬ 
ing mantle. The VUN record shows much more long-period 
energy, especially for S waves, than that at NIU. Thus the 
high-frequency components were more absorbed on the 
path to VUN due to higher attenuation (lower Q) than on the 
more rigid slab path to NIU. In addition, the sharp contrast in 
seismic velocity at the top of slabs can be detected using 
reflected and converted seismic waves (Fig. 2.6-15). 

5.4.2 Earthquakes in subducting slabs 

The deep and intermediate earthquakes forming the Wadati-Benioff
zone extend in some places to depths of almost 700 km
(Fig. 5.4-9). These are the deepest earthquakes that occur: 
away from subduction zones, earthquakes below about 40 km 
are rare. The Wadati-Benioff zone earthquakes illustrate that 
material cold enough to fail seismically (rather than flow) is 
being subducted, and give our best information about the 
geometry and mechanics of slabs. 

The number of earthquakes as a function of depth illustrates 
why we distinguish intermediate and deep earthquakes; seis¬ 
micity decreases to a minimum near about 300 km, and then 
increases again. Deep earthquakes, those below about 300 km, 


are thus generally treated as distinct from intermediate
earthquakes. Deep earthquakes peak at about 600 km, and then
decline to a minimum before 700 km. The focal mechanisms 
also vary with depth; those shallower than 300 km show gen¬ 
erally down-dip tension, whereas those below 300 km show 
generally down-dip compression (Fig. 5.4-10). 

Various explanations for this distribution of earthquakes 
and focal mechanisms are under consideration. One is that 
near the surface the slab is extended by its own weight, whereas 
at depth it encounters stronger lower mantle material, caus¬ 
ing down-dip compression. Another possible factor may be 
mineral phase changes that occur at different depths in the cold 
slab than in the surrounding mantle. 

It is generally assumed that the most crucial effect is the 
negative buoyancy (sinking) of the cold and dense slabs. The 
thermal model gives the force driving the subduction due to 
the integrated negative buoyancy of a slab resulting from the 
density contrast between it and the warmer and less dense 
material at the same depth outside. Because the slab does not 
have a discrete lower end in the analytic model, the net force is

$$F = \iint g\,[\rho(x,y) - \rho_m]\,dx\,dy. \qquad (12)$$


Fig. 5.4-8 Seismological observations showing the difference between the
cold slab and hotter ambient mantle. Comparison of the seismograms at
NIU and VUN shows that high frequencies are transmitted better by the
slab, so the slab is a less attenuating, or higher Q path. (Oliver and Isacks,
1967. J. Geophys. Res., 72, 4259-75, copyright by the American
Geophysical Union.)


Fig. 5.4-10 Stress orientations inferred
from focal mechanisms of subduction zone
earthquakes. The P and T axes are rotated
so that the down-dip direction is at the
center of each plot, and their distributions
are contoured. Top: Events below 300 km
are dominated by down-dip compression.
Bottom: Events from 70-300 km are
dominated by down-dip tension. (After
Vassiliou, 1984. Earth Planet. Sci. Lett.,
69, 195-201, with permission from
Elsevier Science.)


If material outside the slab is at temperature $T_m$ and density $\rho_m$,
material in the slab at the point (x, y) has density

$$\rho(x,y) \approx \rho_m + \frac{\partial \rho}{\partial T}\,[T(x,y) - T_m] = \rho_m + \rho'(x,y). \qquad (13)$$

As for the cooling plate (Eqn 5.3.9), the density perturbation is

$$\rho'(x,y) = \alpha \rho_m\,[T_m - T(x,y)], \qquad (14)$$

so for the analytic temperature model (Eqn 3) the integral over 
the slab yields a force 


p _ S a P 





This force, known as “slab pull,” is the plate driving force 
due to subduction. Specifically, it is the negative buoyancy 
associated with a cold downgoing limb of the convection 
pattern. Its significance for stresses in the downgoing plate and 
for driving plate motions depends on its size relative to the 
resisting forces at the subduction zone. There are several such 
forces. As the slab sinks into the viscous mantle, the material 
displaced causes a force depending on the viscosity of the man¬ 
tle and the subduction rate. The slab is also subject to drag 
forces on its sides and to resistance at the interface between the 
overriding and downgoing plates, which is often manifested 
as earthquakes. 
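
To make the buoyancy integral concrete, the following sketch evaluates the force numerically from the density perturbation of Eqn 14. It is only an illustration: the temperature field is a hypothetical toy function rather than the analytic model (Eqn 3), and the values of g, α, ρ_m, T_m, and the slab dimensions are assumed round numbers.

```python
G = 9.8            # m/s^2
ALPHA = 3e-5       # 1/K, ASSUMED volume thermal expansion coefficient
RHO_M = 3300.0     # kg/m^3, ASSUMED mantle density
T_M = 1300.0       # deg C, ASSUMED ambient mantle temperature

def toy_temperature(x, y, length, thickness):
    """Hypothetical slab temperature (deg C), NOT the analytic model (Eqn 3).

    x is distance down-dip and y is distance across the slab. The slab is
    coldest along its centre at x = 0 and warms linearly toward T_M with
    distance down-dip and toward its upper and lower surfaces.
    """
    across = 1.0 - abs(2.0 * y / thickness - 1.0)   # 0 at surfaces, 1 at centre
    along = 1.0 - x / length                        # 1 at x = 0, 0 at x = length
    return T_M - T_M * across * along

def slab_pull(length, thickness, nx=400, ny=100):
    """Midpoint-rule estimate of the integral of g * rho'(x, y) dx dy (N/m)."""
    dx, dy = length / nx, thickness / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * dx
        for j in range(ny):
            y = (j + 0.5) * dy
            # Density excess of cold slab material relative to the mantle (Eqn 14)
            rho_prime = ALPHA * RHO_M * (T_M - toy_temperature(x, y, length, thickness))
            total += G * rho_prime * dx * dy
    return total

force = slab_pull(length=700e3, thickness=100e3)
print(f"Negative buoyancy per unit length of trench: {force:.2e} N/m")
```

For this toy field the integral can also be done by hand, giving g α ρ_m T_m times one-quarter of the slab's cross-sectional area, or roughly 2 × 10^13 N per meter of trench for the assumed values. The point is the scaling: the force grows with the temperature contrast and with the volume of cold material.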

To gain insight into the relative size of the negative buoy¬ 
ancy (“slab pull”) and resistive forces, we consider the stress 
in the downgoing slab and the resulting focal mechanisms. 
Figure 5.4-11 shows a simple analogy, the stress due to the 
weight of a vertical column of length L of material with density
$\rho$. Using the equilibrium equation (Eqn 2.3.49), we equate the
stress gradient to the body force,

$$\frac{\partial \sigma_{zz}}{\partial z} = -\rho g, \qquad (16)$$


Fig. 5.4-11 Stress within a vertical column of material under its own 
weight, a simple analogy to stress within a downgoing slab. For the same 
body force, different stress distributions result from different boundary 
conditions. If the load is supported at the bottom, the column is under 
compression; if the support is at the top, the column is under tension.
A combination of the two produces a transition.


so the stress as a function of depth is found by integration,

$$\sigma_{zz}(z) = -\rho g z + C, \qquad (17)$$

where C is a constant of integration. To determine C, and thus
the stress in the column, the boundary conditions must be
known.

First, suppose the stress is zero at the top, z = 0. In this case
C = 0 and

$$\sigma_{zz}(z) = -\rho g z, \qquad (18)$$

which is negative, corresponding to compression everywhere.
The forces required at the top and the bottom to maintain
equilibrium are given by the relation between the traction, stresses,
and outward normal vector on a surface (Eqn 2.3.8),

$$T_z = \sigma_{zz} n_z. \qquad (19)$$

At the top $T_z(0) = 0$, whereas at the bottom a force

$$T_z(L) = -\rho g L \qquad (20)$$

holds the column up. This situation is like a column of material
sitting on the earth's surface, under compression everywhere.

Alternatively, suppose the stress is zero at the bottom. In this
case the constant is chosen so that

$$\sigma_{zz}(z) = \rho g\,(L - z), \qquad (21)$$

and the column is in extension ($\sigma_{zz}$ positive) everywhere. The
force at the bottom is zero, and the force at the top,

$$T_z(0) = \rho g L, \qquad (22)$$

supports the column, because $n_z$ points in the -z direction. This
situation corresponds to the material hanging under its own
weight.

Fig. 5.4-12 The absolute velocity of lithospheric plates increases with the
fraction of the plate’s boundary formed by subducting slabs, suggesting
that slabs provide a major driving force for plate motions. (Forsyth and
Uyeda, 1975.)


Fig. 5.4-13 Phase diagram for transitions in olivine with increasing depth.
The phase boundaries as functions of temperature and pressure are known
as Clapeyron curves. The downwelling and upwelling lines contrast
conditions in slabs and plumes, respectively, to those in the ambient
mantle. A reaction with a positive slope, such as the olivine (α phase) to
spinel (β phase) change thought to give rise to the 410 km discontinuity
outside the slab, is displaced upward (to lower pressure) within the cold
slab. By contrast, the γ spinel to perovskite plus magnesiowüstite (pv +
mw) transition has a negative slope, so the 660 km discontinuity should
be deeper in slabs than outside. (After Bina and Liu, 1995. Geophys.
Res. Lett., 22, 2565-8, copyright by the American Geophysical Union.)


If the column is supported equally at both ends, the forces at
either end are equal, so we find the stress from the condition

$$T_z(0) = -T_z(L), \qquad (23)$$

which gives

$$\sigma_{zz}(z) = \rho g\,(L/2 - z). \qquad (24)$$

Thus the column is in extension in its upper half, z < L/2, and in
compression below this point.
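
The three support conditions differ only in the constant of integration, which a short sketch makes explicit. The function name, the `support` labels, and the density and column length are our illustrative assumptions; the magnitudes are not meant as realistic slab stresses, since only the sign pattern matters for the analogy.

```python
def column_stress(z, length, rho=3300.0, g=9.8, support="bottom"):
    """sigma_zz (Pa) at depth z (positive down) in a column of given length.

    Integrating d(sigma_zz)/dz = -rho*g gives sigma_zz = -rho*g*z + C;
    the support condition fixes C. Positive values are extension.
    """
    constants = {
        "bottom": 0.0,                     # stress-free top, column rests on its base
        "top": rho * g * length,           # stress-free bottom, column hangs
        "both": rho * g * length / 2.0,    # equal support at both ends (Eqn 24)
    }
    return -rho * g * z + constants[support]

L = 600e3  # column length in m, an ASSUMED illustrative value
for support in ("bottom", "top", "both"):
    profile = [column_stress(z, L, support=support) / 1e9 for z in (0.0, L / 2, L)]
    print(support, [f"{s:+.1f} GPa" for s in profile])
```

The printed profiles show compression everywhere for support at the bottom, extension everywhere for support at the top, and a change from extension to compression at mid-depth when both ends share the load.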

The stress in the column shows how the body force due to 
gravity is balanced by forces on the boundaries. By analogy, if 
the downgoing slab were in tension, the negative buoyancy 
force must exceed the resistive forces at the subduction zone, 
and the slab would be “pulling” on and supported by the 
remainder of the plate outside the subduction zone. In fact, 
most earthquakes in the deeper portions of the slab show 
down-dip compression, whereas the intermediate earthquakes 
show down-dip tension (Fig. 5.4-10). This situation is like the 
column supported at both ends. 

These ideas about the forces within subduction zones are 
consistent with two important pieces of data. First, the average 
absolute velocity of plates increases with the fraction of their 
area attached to downgoing slabs (Fig. 5.4-12), suggesting 
that slabs are a major determinant of plate velocities. Second, 
as discussed in Section 5.5.2, earthquakes in old oceanic 
lithosphere have thrust mechanisms, demonstrating deviatoric 
compression. Thus the net effect of the subduction zone on the 
remainder of the plate is not a “pull,” so the term “slab pull” 
is misleading. Instead, as implied by the slab stress models, 
the “slab pull” force is balanced by local resistive forces, a com¬ 
bination of the effects of the viscous mantle and the interface 


between plates. This situation is like an object dropped in a 
viscous fluid, which is accelerated by its negative buoyancy 
until it reaches a terminal velocity determined by its density and 
shape and the viscosity and density of the fluid. 

An interesting possible complication is that slabs are not just 
thermally different from their surroundings; they are probably 
also mineralogically different. Slabs extend through the mantle 
transition zone, where mineral phase changes are thought to 
occur (Section 3.8). However, because a downgoing slab is 
colder than material at that depth elsewhere, phase changes 
within the slab are displaced relative to their normal depth. 
The displacement can be calculated using the thermodynamic 
relation, known as the Clapeyron equation, for the boundary 
between two phases as a function of pressure and temperature. 
If $\Delta H$ and $\Delta V$ are the heat and volume changes resulting from
the phase change, then a change dT in temperature moves the
phase change by a pressure dP given by the Clapeyron slope
(the reciprocal of Eqn 9),

$$\frac{dP}{dT} = \frac{\Delta H}{T\,\Delta V}. \qquad (25)$$
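
A rough sense of the size of this displacement comes from converting the pressure shift to depth with the lithostatic relation dP = ρg dz. The sketch below does this for one positive and one negative Clapeyron slope. All numbers are assumptions for illustration: the slopes of +3 and -2 MPa/K, the 500 K temperature deficit, and the density and gravity used for the conversion are round values, not results from this chapter.

```python
def boundary_shift_km(clapeyron_mpa_per_k, delta_t_k, rho=3700.0, g=10.0):
    """Depth shift (km, positive down) of a phase boundary.

    dP = (dP/dT) * dT from the Clapeyron slope, converted to depth
    with the lithostatic relation dP = rho * g * dz.
    """
    delta_p_pa = clapeyron_mpa_per_k * 1e6 * delta_t_k
    return delta_p_pa / (rho * g) / 1e3

# ASSUMED illustrative values: slab interior 500 K colder than the
# surrounding mantle, slopes of +3 MPa/K and -2 MPa/K.
delta_t = -500.0
print(f"positive slope: {boundary_shift_km(+3.0, delta_t):+.0f} km")
print(f"negative slope: {boundary_shift_km(-2.0, delta_t):+.0f} km")
```

Under these assumptions a boundary with a positive slope rises by a few tens of kilometers in the cold slab, whereas one with a negative slope is depressed by a comparable amount, the pattern sketched in Fig. 5.4-13.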


For example, the 410 km discontinuity is attributed to the 
phase change with increased pressure from olivine to a denser 
spinel structure (the β phase, wadsleyite) described by a phase
diagram like that in Fig. 5.4-13. Because the spinel phase is
denser, $\Delta V$ is less than zero. This reaction is exothermic (gives
off heat), so $\Delta H$ is also negative, causing a positive Clapeyron
slope. If we know the depth (pressure) and temperature at 
which a phase change occurs in the mantle, the Clapeyron 
equation gives its position in the slab. The slab is colder than


Fig. 5.4-14 Predicted mineral phase boundaries
and resulting buoyancy forces in a downgoing slab
without (left panels) and with (right panels) a
metastable olivine wedge. Assuming equilibrium
mineralogy, the cold slab has negative thermal
buoyancy, negative compositional buoyancy
associated with the elevated 410 km discontinuity,
and positive compositional buoyancy associated
with the depressed 660 km discontinuity.
A metastable wedge gives positive compositional
buoyancy and hence decreases the force driving
subduction. Negative buoyancy favors subduction,
whereas positive buoyancy opposes it. (Stein and
Rubie, 1999. Science, 286, 909-10, copyright 1999
American Association for the Advancement of
Science.)


The deflections of the phase boundaries have several possible
consequences. First, phase changes affect the thermal
structure of the slab due to the heat of the phase change. Thus 
the exothermic olivine-spinel change should add heat to 
slabs. This effect is simulated in thermal models by increas¬ 
ing the temperature at the phase change. Second, the phase 
boundaries are probably important for the buoyancy and 
stresses within slabs. We have already discussed the idea that 
the cold slabs are denser than their surroundings, causing 
negative thermal buoyancy, which favors sinking. The phase 
boundaries cause additional mineralogical buoyancy. For 
example, if the olivine-spinel boundary is uplifted in the slab, 
the presence of slab material denser than at that depth outside 
causes additional negative buoyancy. However, if a wedge of 
metastable olivine exists, it would be less dense than material at 


that depth outside and produce positive buoyancy (Fig. 5.4-14) 
in addition to that caused by the downward deflection of the 
660 km discontinuity. Although the net buoyancy must be 
negative because slabs subduct, the details of the buoyancy can 
be important. For example, metastable olivine may help regu¬ 
late subduction rates. Faster subduction would cause a larger 
wedge of low-density metastable olivine, reducing the driving 
force and slowing the slab. 

A third possibility is that a phase change causes deep 
earthquakes. Although this idea is a natural consequence of 
the observation that deep earthquakes occur at transition 
zone depths, it was not given serious consideration for a long 
time because deep earthquake focal mechanisms show slip 
on a fault, rather than isotropic implosions (Section 4.4.6). 
However, laboratory studies now suggest that an instability 



Fig. 5.4-16 Numerical model of mantle flow fields (lower left) and resulting stresses (upper right) within a downgoing slab for the cases of a slab that (A)
encounters higher-viscosity material below 670 km and (B) cannot penetrate below this depth. η values show relative viscosities. Both predict down-dip
tension in the upper portion of the slab and down-dip compression in the lower portion. The calculated stresses are highest near the bottom of the slab.
(Vassiliou et al., 1984. J. Geodynam., 1, 11-28, with permission from Elsevier Science.)


called transformational faulting can cause slip along thin shear
zones where metastable olivine transforms to denser spinel.
Such faulting can occur for the exothermic olivine to spinel
transition, but not for the endothermic spinel to perovskite plus
magnesiowüstite transition, so deep earthquakes would occur
only in the transition zone. Because the metastable wedge’s 
lower boundaries are essentially isotherms, this model offers a 
physical mechanism for the observation (Fig. 5.4-4) that the 
depth of earthquakes increases with thermal parameter. This 
idea is attractive, but to date seismological studies show no 
evidence for a metastable wedge, and large deep earthquakes 
occur on fault planes that appear to extend beyond the bounda¬ 
ries of the expected metastable wedge. If such wedges exist, 
earthquakes may nucleate by transformational faulting, but 
then propagate outside the wedge via another failure mecha¬ 
nism. 


Fig. 5.4-17 Numerical models of stresses
within a downgoing slab assuming the
density distribution corresponding to
equilibrium mineralogy (left panels) and
with metastable olivine (right panels).
Upper panels show stress orientations, and
lower panels show stress magnitudes, with
compression as negative, compared to the
distribution of seismicity (lower center).
(Bina, 1997. Geophys. Res. Lett., 24,
3301-4, copyright by the American
Geophysical Union.)

Together these ideas offer several possible explanations 
for features of slab earthquakes. One key feature is the depth 
variation in seismicity and focal mechanisms. The first ex¬ 
planation is that the depth distribution and stresses are largely 
due to the negative thermal buoyancy of slabs and their en¬ 
countering either a region of much higher viscosity or a barrier 
to their motion at the 660 km discontinuity. Numerical models 
(Fig. 5.4-16) predict stress orientations similar to those implied 
by the focal mechanisms. Moreover, the magnitude of the 
stress varies with depth in a fashion similar to the depth dis¬ 
tribution of seismicity — a minimum at 300-410 km and an 
increase from 500 to 700 km. Alternatively, numerical models 
including the buoyancy effects of the phase changes (Fig. 5.4-14)
also predict a similar variation in stress magnitude and
orientation with depth (Fig. 5.4-17), without invoking a 
barrier or higher viscosity in the lower mantle. Thus, in such
models, deep earthquakes need not be physically different from
intermediate ones, because the minimum in seismicity reflects
a stress minimum.


Fig. 5.4-18 North-south cross-section
showing seismicity of subduction zones of
the Northwest Pacific. Seismicity shallows
near the cusps where arcs meet, making
individual Wadati-Benioff zones tongue-shaped.
Large deep earthquakes ($M_0$ greater
than $10^{26}$ dyn-cm), shown by open circles,
tend to be at the edges or bottoms of deep
seismicity, or isolated from the main
Wadati-Benioff zones. (Kirby et al., 1996b.
Rev. Geophys., 34, 261-306, copyright by
the American Geophysical Union.)

A second key issue is how deep earthquakes can occur at all. 
As discussed in Section 5.7, the strength of rock that must be
exceeded for fracture increases with pressure. The pressures 
deep in a subducting slab should be high enough to prevent 
fracture. One possibility is that the slabs become hot enough 
that water released by decomposition of hydrous minerals 
lubricates (reduces the effective stress on) faults. Another 
possibility, mentioned earlier, is transformational faulting in 
metastable olivine. It is also possible that the earthquakes occur 
by very rapid creep, possibly associated with weakening due to 
unusually small spinel grains formed in the coldest slabs. 

The different explanations offered by these models all have 
attractive features and may be true in part. However, although 
such simple models based on idealized slabs explain some gross 
features of deep earthquakes, none fully explains the com¬ 
plexity of deep earthquakes. As shown by Fig. 5.4-18, a cross- 
section along the subduction zones of the Northwest Pacific, 
deep seismicity is “patchy” and variable. For example, it 
shallows dramatically at the cusps between the Marianas, 
Izu-Bonin, NE Japan, and Kuril-Kamchatka arcs. Moreover, 
the largest earthquakes occur at the edges of the regions of deep 
seismicity, as especially evident at the northern edge of the 
Izu-Bonin seismicity. These sites may reflect tears in the down¬ 
going lithosphere at the junctions between arcs, where hot 
mantle material penetrates slabs. A further complexity is that 
some deep earthquakes occur in unusual locations off the 
down-dip extension of the main Wadati-Benioff zones and 
have focal mechanisms differing from those of the deepest 
earthquakes in the main zone (Fig. 5.4-19). Some other deep 
earthquakes are isolated from actively subducting slabs. Such 
unusual earthquakes may occur in slab fragments where meta¬ 
stable olivine survives, and thus have mechanisms related to 
local stresses rather than those expected for continuous slabs. 



Fig. 5.4-19 Seismicity cross-section for the Fiji subduction zone, showing 
“outlier” deep earthquakes. Lines through symbols show P axes, which 
often differ from those for the main Wadati-Benioff zone. (Lundgren 
and Giardini, 1994. J. Geophys. Res., 99, 15,833-42, copyright by the
American Geophysical Union.) 

Another interesting observation, from precise earthquake
locations in some subduction zones (Fig. 5.4-20), is that
the Wadati-Benioff zone is made up of two distinct planes,
separated by 30-40 km. The upper plane seems to coincide 
with the conversion plane for ScSp (Fig. 2.6-15), a sharp 
velocity contrast that is presumably near the slab top. Focal 
mechanisms suggest that the upper plane is in down-dip com¬ 
pression and the lower one in down-dip extension. A variety of 
models have been proposed. One is that the double plane re¬ 
sults from “unbending” of the slab — the release of the bending 
stresses produced when the slab began to subduct. Another 
model is that the slab “sags” under its own weight, because at 
depth it runs into a more viscous mesosphere, while at inter¬ 
mediate depths it encounters a less viscous asthenosphere. 
Explaining the phenomenon is complicated by the observation 
that only some subduction zones have double zones. 

The nature of deep earthquakes, especially the mechanism 
restricting them to the transition zone, has implications for
mantle flow. The simplest explanation for the cessation of deep
seismicity is that slabs cannot penetrate the lower mantle.
However, as shown in Fig. 5.4-21, tomographic studies (Chapter 7)
indicate that although some slabs are deflected at
660 km, they eventually penetrate deeper. Hence models in
which earthquakes stop either because the stress is not high
enough or because the phase changes causing them no longer
occur seem more likely. The issue is important because heat
and mass transfer between the upper and lower mantles have
major implications for the dynamics and evolution of the earth
(Section 3.8). At present, most models favor some degree of
communication between the two (Fig. 5.1-2). Slabs are sometimes
deflected at the 660 km discontinuity, where they warm
further, lose any buoyant metastable wedge, and then penetrate
into the lower mantle. Thus the slab geometry we see likely
reflects a complex set of effects. To cite another, some flat-lying
slabs at the 660 km discontinuity may be caused by the trench
“rolling backward” in the absolute (mantle) reference frame.


Fig. 5.4-20 Double seismic zone beneath Tohoku, Japan. (Hasegawa et al.,
1978. Tectonophysics, 47, 43-58, with permission from Elsevier Science.)


Fig. 5.4-21 Tomographic images across Pacific
subduction zones with deep earthquakes. Horizontal
lines are at 410 and 660 km depth. White dots are
earthquake hypocenters. The Wadati-Benioff zone
seismicity generally coincides with the high-velocity
anomaly (dark regions) due to the cold subducting slab.
Slabs are deflected at the base of the transition zone
before penetrating into the lower mantle. (van der Hilst
et al., 1998. The Core-Mantle Boundary Region, 5-20,
copyright by the American Geophysical Union.)


Fig. 5.4-22 Schematic model for intermediate depth
earthquakes. Earthquakes are assumed to occur in
subducting crust and be associated with the dehydration
of mineral phases and the gabbro to eclogite transition.
(Kirby et al., 1996a. Subduction, 195-214, copyright
by the American Geophysical Union.)


There has also been considerable discussion about the nature
of intermediate depth earthquakes. Figure 5.4-22 shows a
schematic model in which the earthquakes are presumed to
occur in subducting oceanic crust, rather than throughout the 
subducting mantle that makes up most of the slabs, because 
detailed location studies show that the earthquakes are close to 
the top of the subducting slabs. The crust should undergo two 
important mineralogical transitions as it subducts. Hydrous 
(water-bearing) minerals formed at fractures and faults should 
warm up and dehydrate. Eventually, the gabbro transforms to 
eclogite, a rock of the same chemical composition composed 
of denser minerals. 4 Under equilibrium conditions, eclogite 
should form by the time slab material reaches about 70 km 
depth. However, travel time studies in some slabs find a low- 
velocity waveguide interpreted as subducting crust extending 
to deeper depths. Hence it has been suggested that the eclogite- 
forming reaction is slowed in cold downgoing slabs, allowing 
gabbro to persist metastably. Once dehydration occurs, the 
freed water weakens the faults, favoring earthquakes and 
promoting the eclogite-forming reactions. In this model the 
intermediate earthquakes occur by slip on faults, but the phase 
changes favor faulting. The extensional focal mechanisms may 
also reflect the phase change, which would produce extension 
in the subducting crust. Support for this model comes from the 
fact that the intermediate earthquakes occur below the island 
arc volcanoes, which are thought to result when water released 
from the subducting slab causes partial melting in the overlying 
asthenosphere. 

The fact that various explanations are under discussion illus¬ 
trates the difficulty in understanding the complex thermal 
structure, mineralogy, rheology, and geometry of real slabs. 
We can think of the deep subduction process as a chemical 
reactor that brings cold shallow minerals into the temperature 
and pressure conditions of the mantle transition zone, where 
these phases are no longer thermodynamically stable (Fig. 5.4- 
23). Because we have no direct way of studying what is happen¬ 
ing and what comes out, we seek to understand this system by 
studying earthquakes that somehow reflect what is happening. 
This is a major challenge, and we have a long way to go. 

5.4.3 Interplate trench earthquakes 

Much of what is known about the geometry and mechanics of 
the interaction between plates at subduction zones comes from 
the distribution and focal mechanisms of shallow earthquakes 
at the interface between the plates. These include the largest 
earthquakes that occur, as illustrated by Fig. 5.4-24, showing 
the largest earthquakes (surface wave magnitude greater than 
8.0) during 1904-76. Among these are the two largest earthquakes
ever recorded seismologically: the 1960 Chilean ($M_0$ $2 \times 10^{30}$
dyn-cm, $M_s$ 8.3) and 1964 Alaska ($M_0$ $5 \times 10^{29}$ dyn-cm,
$M_s$ 8.4) earthquakes. Figure 5.4-25 shows the geometry of the
Chilean earthquake: 21 meters of slip occurred on a fault 

4 Most of the oceanic crust consists of gabbro, the intrusive version of the extrusive 
basalt seen at mid-ocean ridges (Section 3.2.5). With increasing pressure, gabbro 
becomes eclogite as feldspar and pyroxene transform to garnet. 



Fig. 5.4-23 Cartoon of subducting slabs in the transition zone as a 
chemical reactor. (Kirby et al., 1996b. Rev. Geophys., 34, 261-306,
copyright by the American Geophysical Union.) 


800 km long along strike, and 200 km wide down-dip. The 
mechanism shows thrusting of the South American plate over 
the subducting oceanic lithosphere of the Nazca plate. The 
aftershock zone was 800 km long, and the surface deformation 
was dramatic, reaching 6 meters of uplift in places. Thrust 
earthquakes of this type, although smaller, make up most of the 
large, shallow events at subduction zones. Such interplate 
earthquakes release the plate motion that has been locked at 
the plate interface. As we saw in Section 4.6.1, these can be 
much bigger than the largest earthquakes at transform fault 
boundaries like the San Andreas. For example, even the 1906 
San Francisco earthquake was tiny (100 times smaller seismic 
moment) compared to the 1964 Alaska earthquake, although 
both occurred along different segments of the same plate 
boundary. The difference reflects the fact that faulting occurs 
only when rock is cooler than a limiting temperature. Thus a 
vertically dipping transform like the San Andreas has a much 
shorter cold down-dip extent than the shallow-dipping thrust 
interfaces (sometimes called megathrusts) at subduction zones. 
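
The fault dimensions quoted for the Chilean earthquake can be tied to its seismic moment with the standard relation between moment, rigidity, fault area, and slip. The sketch below is an order-of-magnitude check, not a source study: the rigidity of 5 × 10^10 Pa is an assumed value, the function names are hypothetical, and the moment magnitude formula is the commonly used definition for moments in dyn-cm.

```python
import math

def seismic_moment_dyn_cm(length_km, width_km, slip_m, rigidity_pa=5e10):
    """M0 = rigidity * fault area * slip, in dyn-cm (1 N-m = 1e7 dyn-cm).

    The default rigidity is an ASSUMED value, not one given in the text.
    """
    area_m2 = (length_km * 1e3) * (width_km * 1e3)
    return rigidity_pa * area_m2 * slip_m * 1e7

def moment_magnitude(m0_dyn_cm):
    """Moment magnitude Mw from a seismic moment in dyn-cm."""
    return math.log10(m0_dyn_cm) / 1.5 - 10.73

# Fault length, width, and slip quoted in the text for the 1960 event
m0 = seismic_moment_dyn_cm(800, 200, 21)
print(f"M0 ~ {m0:.1e} dyn-cm, Mw ~ {moment_magnitude(m0):.1f}")
```

With the assumed rigidity, the estimate falls within a factor of two of the 2 × 10^30 dyn-cm quoted above. The same formula shows why a moment ratio of 100, as between the 1964 Alaska and 1906 San Francisco earthquakes, corresponds to a difference of about 1.3 magnitude units.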

Major thrust earthquakes at the interface between sub¬ 
ducting and overriding plates directly indicate the nature of 
subduction. In most cases, their focal mechanisms show slip 
toward the trench, approximately in the convergence direction 
predicted by global plate motion models or space-based geo¬ 
desy (Section 5.2) (Fig. 5.2-3). However, in some cases when 
the plate motion is oblique to the trench, a forearc sliver moves 
separately from the overriding plate (Fig. 5.4-26). This effect.


Fig. 5.4-26 Schematic illustration of forearc sliver motion when
convergence is oblique. (Courtesy of D. Davis.) 

Given such repeatability, it seems likely that a segment of a 
subduction zone that has not slipped for some time constitutes 
a seismic gap and is due for an earthquake. For example, the 
Tokai area (segment D) may be such a case and is the focus of
extensive earthquake prediction studies. However, despite the
intuitive appeal of the gap idea, efforts to predict the location
of future earthquakes using it have not generally been successful
(Sections 1.2.5, 4.7.3).

One difficulty is that not all of the plate motion occurs 
seismically. Figure 5.4-28 shows that during 1952-73 a large 
segment of the Kuril trench slipped in a series of six major 
earthquakes with similar thrust fault mechanisms. Seismic
moment studies show that the average slip was 2-3 meters. 
Since the previous major earthquake sequence in the area 
occurred about 100 years earlier, the average seismic slip rate 
is 2-3 cm/yr, about one-third of the plate motion predicted 
from relative motion models. The remaining two-thirds of the 
slip occurs aseismically, as postseismic or interseismic motion. 
Similar studies around the world find that the fraction of plate 
motion that occurs as seismic slip, sometimes called the seismic 
coupling factor, is generally much less than 1, implying that 
much of the plate motion occurs aseismically if the time 
interval sampled is adequate. 
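
The Kuril estimate is simple enough to reproduce directly. The sketch below uses the slip and recurrence time given above; the plate rate of 8 cm/yr is an assumed round number consistent with the statement that seismic slip is about one-third of the plate motion, and the function names are hypothetical.

```python
def seismic_slip_rate_cm_per_yr(slip_m, recurrence_yr):
    """Average seismic slip rate: slip per sequence divided by recurrence time."""
    return slip_m * 100.0 / recurrence_yr

def seismic_coupling(slip_m, recurrence_yr, plate_rate_cm_per_yr):
    """Fraction of the plate motion released as seismic slip."""
    return seismic_slip_rate_cm_per_yr(slip_m, recurrence_yr) / plate_rate_cm_per_yr

PLATE_RATE = 8.0  # cm/yr, ASSUMED round value for illustration

# Kuril trench: 2-3 m of slip per sequence, sequences about 100 years apart
for slip_m in (2.0, 3.0):
    rate = seismic_slip_rate_cm_per_yr(slip_m, 100.0)
    fraction = seismic_coupling(slip_m, 100.0, PLATE_RATE)
    print(f"slip {slip_m:.0f} m: {rate:.1f} cm/yr seismic, coupling ~ {fraction:.2f}")
```

Because the result scales inversely with the recurrence time, a time sample that happens to be too short or too long changes the inferred coupling in direct proportion.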

The Chilean subduction zone shows the other extreme. The 
seismic slip rate, estimated from the slip in the great 1960 
earthquake and historical records indicating that major earth¬ 
quakes occurred about every 130 years during the past 400 
years, exceeds the convergence rate predicted by plate motion 
models (Fig. 5.4-29). Because the convergence rate is an upper 
bound on the seismic slip rate, the two estimates are inconsist¬ 
ent. One possibility is that the seismic slip is overestimated: 
either the earlier earthquakes were significantly smaller than 
the 1960 event or their frequency in the past 400 years is higher 
than the long-term average. 
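
As a rough check on the size of the discrepancy, we can form a seismic slip rate from numbers quoted in this section, assuming as a simplification that each event in the roughly 130-year cycle had the 21 m of slip reported for 1960:

$$\dot{u}_{\text{seismic}} \approx \frac{21\ \text{m}}{130\ \text{yr}} \approx 0.16\ \text{m/yr} = 16\ \text{cm/yr}.$$

Halving the average slip per event, or doubling the recurrence interval, would halve this estimate, which shows how sensitive the comparison is to the two possibilities just mentioned.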

More generally, these examples illustrate the difficulty in 
inferring seismic slip from historical seismicity, owing to



Fig. 5.4-27 Time sequence of large subduction zone earthquakes along
the Nankai trough, suggesting both some space and time periodicity
and some variability. (Ando, 1975. Tectonophysics, 27, 119-40, with
permission from Elsevier Science.)


problems including the variability of earthquakes on a given plate
boundary, the issue of whether the time sample is long enough, 
and the difficulty in estimating source parameters for earth¬ 
quakes that pre-dated instrumental seismology. Given the un¬ 
certainties in estimating the slip in an earthquake even with 
seismological data (Section 4.6), doing so without such data is 
particularly challenging. An alternative approach to estimating 
plate coupling, discussed in Sections 4.5.4 and 5.6.2, uses GPS 
geodesy to measure the deflection of the overriding plate, 
which will be released in future large earthquakes. This deflec¬ 
tion depends on the mechanical coupling at the interface, so 
directly measures what we infer indirectly from the earthquake 
history. However, the GPS data sample only the present earth¬ 
quake cycle, which may not be representative of long-term 
behavior. 

Perhaps for similar reasons, efforts to interpret the seismic 
slip fraction in terms of the physical processes of subduction 
have not yet been successful. Although the term “seismic 
coupling” implies a relation between the seismic slip fraction
and properties such as the mechanical coupling between the
subducting and overriding lithospheres, this has been hard 
to establish. This relation was originally posed in terms of 
two end members: coupled Chilean-type zones with large 
earthquakes and uncoupled Mariana-style zones with largely 
aseismic subduction. The largest subduction zone earthquakes 
appear to occur where young lithosphere subducts rapidly 
(Fig. 5.4-30, top), where we might expect the minimum “slab 
pull” effects and hence the strongest coupling. However, 

Fig. 5.4-28 Rupture areas for a sequence
of large subduction zone earthquakes along 
the Kuril trench. Different segments of the 
boundary slip seismically over time. Arrows 
show the direction and rate of seismic slip 
and plate motion. If such sequences occur 
about every 100 years and this time sample 
is representative, the seismic slip is only 
about one-third of the plate motion. 
(Kanamori, 1977b. Island Arcs, Deep Sea 
Trenches and Back Arc Basins, 163-74,
copyright by the American Geophysical 
Union.) 

Fig. 5.4-29 Comparison of seismic slip rate and plate motions for the
area of the great 1960 Chilean earthquake. Shaded region gives slip rate
estimated from slip in the 1960 event and recurrence of large trench
earthquakes in the last 400 years. The estimated slip rate exceeds that
predicted by any of the four plate motion models shown. (Stein et al.,
1986. Geophys. Res. Lett., 13, 713-16, copyright by the American
Geophysical Union.)


efforts to correlate the seismic slip fraction with subduction 
zone properties such as convergence rate or plate age find no 
clear pattern (Fig. 5.4-30, bottom). It has also been suggested 
that seismic coupling may be lowest for sedimented trenches 
and where normal stress on the plate interface is low, although 
these plausible ideas have yet to be demonstrated. Thus although
seismic coupling can be defined from the seismic slip
fraction, its relation to the mechanics of plate coupling is still
unclear. It appears that most subduction zones have significant 
components of aseismic slip, as do oceanic transforms and 
many continental plate boundaries (Section 5.6.2). Hence 
even given the considerable uncertainties in such estimates, it 
appears common for a significant fraction of plate motion to 
occur aseismically. 


The difficulty in estimating seismic coupling and understanding
the process of aseismic plate motion has consequences
for estimating the recurrence of earthquakes on a plate
boundary and the seismic gap concept. It may be difficult to 
distinguish between gaps and areas where much of the slip is 
aseismic. For example, we would not want to say both that 
areas with recent major seismicity have high seismic hazard 
and that areas with little recent seismicity are gaps with high 
seismic hazard. Moreover, as discussed in Sections 1.2 and
4.7.3, the process of earthquake faulting may be sufficiently
random that it is hard to use the plate motion rate and seismic
history to usefully predict how long it will be until the next
large earthquake.

Although most shallow subduction zone seismicity is at the 
plate interface, some earthquakes occur within either plate.
Some appear to result from flexural bending of the downgoing 
plate as it enters the trench (Fig. 5.4-31). Focal depth studies 
show a pattern of normal faulting in the upper part of the plate 
to a depth of 25 km, and thrusting in its lower part, between 40
and 50 km. These observations constrain the position of the
neutral surface dividing the mechanically strong lithosphere
(Section 5.7.4) into upper extensional and lower compressional
zones. In some cases the normal fault earthquakes are so
large that they may be “decoupling” events due to “slab pull”
that rupture the entire downgoing plate (Fig. 5.4-32). Aftershock
distributions and studies of the rupture process indicate
that faulting extended through a major portion, and perhaps 
all, of the lithosphere. Rupture through the entire lithosphere 
favors the decoupling model. If only a portion of the lithosphere
breaks, the interpretation is more complicated. Rupture
may have been restricted to one side of the neutral surface (in 
the flexural model) or reflect the material below being too hot 
and weak for seismic rupture. In the latter case, the entire 
lithosphere could have failed, with the deeper rupture being 
aseismic.

Fig. 5.4-30 Top: Variation in the magnitude (M_w) of
the largest known subduction thrust fault earthquake
between subduction zones as a function of the convergence
rate and age of the subducting lithosphere. (Ruff and
Kanamori, 1980. Phys. Earth Planet. Inter., 23, 240-52,
with permission from Elsevier Science.) Bottom: Seismic
coupling fraction estimated from historical seismicity at
various subduction zones. Although most subduction zones
show considerable aseismic slip, there is no obvious
correlation with either age of the subducting lithosphere
(left) or subduction rate (right). (Pacheco et al., 1993.
J. Geophys. Res., 98, 14,133-59, copyright by the
American Geophysical Union.)

Fig. 5.4-31 Focal depths of flexural earthquakes due to the bending of
subducting plates as they enter the trench. Tensional events occur above
the neutral surface, and compressional events occur below it. The plate
mechanical thickness, H, increases with age, as expected from thermal
models. (After Bodine et al., 1981. J. Geophys. Res., 86, 3695-707,
copyright by the American Geophysical Union.)



Fig. 5.4-32 Large normal faulting earthquakes at trenches, such as the 
1965 M_s 7.5 Rat Island earthquake, may be due to flexure or failure of the
lithosphere under its own weight. The extent of aftershocks, which appear 
not to cut the entire lithosphere, may reflect the extent of rupture or be a 
temperature effect. (Wiens and Stein, 1985. Tectonophysics, 116, 143-62, 
with permission from Elsevier Science.) 

5.5 Oceanic intraplate earthquakes and tectonics

The vast majority of earthquakes — especially when measured 
in terms of seismic moment release — occur on plate boundaries 
and reflect the relative plate motions there. However, intraplate
earthquakes, those within plates, also provide important
tectonic information. We discuss intraplate earthquakes that 
occur in oceanic lithosphere in this section, and then discuss 
their counterparts in continental lithosphere in the next. 

5.5.1 Locations of oceanic intraplate seismicity

Figure 5.5-1 illustrates the distribution of earthquakes in the 
Atlantic Ocean, excluding those along the Mid-Atlantic ridge. 
Although these earthquakes are rarer than those along the 



Fig. 5.5-1 Distribution of earthquakes in the Atlantic Ocean other than 
those on ridge and transform segments of the Mid-Atlantic ridge system. 
(Wysession et al., 1995. © Seismological Society of America. All rights
reserved.) 


ridges and transforms making up the Mid-Atlantic ridge plate 
boundaries, there are enough to justify interest. They nicely 
illustrate that plates deviate from the ideal case of perfect rigidity
without internal deformation, such that all motion occurs at
narrow boundaries. Instead, as noted in Section 5.2, real plates 
are complicated entities that have both internal deformation 
and diffuse boundary zones. 

One way to think about these earthquakes is to consider a 
hierarchy, from slow-moving plate boundaries, to recognizable 
weak structures, and then to apparently isolated earthquakes. 
For example, the Atlantic portion of the boundary between the 
Eurasian and African plates, which stretches from Gibraltar to 
the Azores, is poorly defined by topography and seismicity 
compared to the Mid-Atlantic ridge. However, the focal 
mechanisms (Fig. 5.5-2, top) show a transition from extension 
at the Terceira Rift near the Azores, to strike-slip along a 
segment that includes the mapped Gloria transform fault, to 
compression near Gibraltar, and then into the Mediterranean. 
This transition reflects the fact that the Euler pole is close 
enough that the relative motions are small and change rapidly 
with distance (Fig. 5.5-2, bottom). For example, near the triple 
junction the NUVEL-1A model (Table 5.2-1) predicts 4 mm/yr 
of extension resulting from the small difference between 
Eurasia-North America (23 mm/yr at N97°E) and Africa- 
North America (20 mm/yr at N104°E) spreading across the 
Mid-Atlantic ridge. Even in the western Mediterranean, the 
motions are too slow to generate a well-developed subduction 
zone like those of the Pacific, but instead cause a broad convergent
zone indicated by large earthquakes like the 1980 M_s 7.3
El Asnam, Algeria, earthquake.
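
The 4 mm/yr figure follows from subtracting the two spreading vectors quoted above. The sketch below treats the motions as flat-earth vectors at the triple junction, with azimuths measured clockwise from north; the function names are illustrative and not part of any plate motion package.

```python
import math


def velocity_components(rate_mm_yr, azimuth_deg):
    """Return (east, north) components of a velocity given rate and azimuth."""
    az = math.radians(azimuth_deg)
    return rate_mm_yr * math.sin(az), rate_mm_yr * math.cos(az)


def relative_motion(rate_a, az_a, rate_b, az_b):
    """Rate and azimuth of motion A minus motion B (both relative to one plate)."""
    east_a, north_a = velocity_components(rate_a, az_a)
    east_b, north_b = velocity_components(rate_b, az_b)
    d_east, d_north = east_a - east_b, north_a - north_b
    rate = math.hypot(d_east, d_north)
    azimuth = math.degrees(math.atan2(d_east, d_north)) % 360.0
    return rate, azimuth


# Eurasia-North America (23 mm/yr at N97E) minus Africa-North America
# (20 mm/yr at N104E) gives the Eurasia-Africa motion near the triple junction.
rate, azimuth = relative_motion(23.0, 97.0, 20.0, 104.0)
```

The resulting rate is close to 4 mm/yr, showing how a small difference between two similar spreading vectors yields the slow motion along this boundary.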

Even slower motion appears to be why sea floor topography 
shows no clear evidence for the boundary between the North 
American and South American plates shown by the dashed
line in Fig. 5.5-1, despite a diffuse zone of seismicity in this 
area. This zone is considered to be a plate boundary, based on 
detailed studies of plate motions. These studies invert plate 
motion data (spreading rates, transform fault directions, and 
earthquake slip vectors; Section 5.2.2) to find Euler vectors 
under two different assumptions: either there is a single American
plate, or there are two. The Euler vectors derived by assuming
there are two plates fit the data better, which would be
expected, because a model with more parameters always fits
the data at least as well. However, statistical tests (Section 7.5.2) show that
the fit to the data improves more than expected purely by 
chance due to the additional parameters, implying that the two 
plates are distinct. 
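
One common form of such a test is the F-ratio test for additional model parameters. The sketch below assumes that form; whether it matches the test of Section 7.5.2 in every detail is an assumption, and the misfit values and parameter counts are hypothetical. The one fixed point is that splitting a plate adds one Euler vector, and hence three parameters.

```python
def f_statistic(chi2_simple, params_simple, chi2_complex, params_complex, n_data):
    """F ratio for the misfit reduction gained by adding parameters.

    chi2_simple, chi2_complex: misfits of the two models to the same data.
    params_simple, params_complex: number of free parameters in each model.
    n_data: number of data.
    """
    extra = params_complex - params_simple
    dof = n_data - params_complex
    if extra <= 0 or dof <= 0:
        raise ValueError("need extra parameters and positive degrees of freedom")
    return ((chi2_simple - chi2_complex) / extra) / (chi2_complex / dof)


# Hypothetical example: one American plate (6 parameters) versus two
# (9 parameters, one extra Euler vector), fit to 200 data.
f_value = f_statistic(chi2_simple=260.0, params_simple=6,
                      chi2_complex=200.0, params_complex=9, n_data=200)
```

The computed value would then be compared with the tabulated F distribution for (3, 191) degrees of freedom at the chosen significance level; a value well above the critical one indicates that the improvement is larger than expected by chance.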

The North America-South America Euler vector that results 
from inverting the data is not well constrained, because it is 
not derived directly from data recording the motion between 
North America and South America, but is estimated from 
closure of the plate circuit (Fig. 5.2-5). Thus the estimate of 
motion results from the difference between North America- 
Africa and South America-Africa motions, which are quite 
similar (if they were not, the data would clearly show two distinct
American plates). The predicted motion along the North

Fig. 5.5-2 Top: Focal mechanisms along
the western section of the Eurasia-Africa
plate boundary. Note the transition from
extension near the Azores, to strike-slip (the
Gloria fault is a transform), to compression
near Gibraltar and into the Mediterranean.
Bottom: Motions with respect to Africa
along the boundary predicted by an Euler
pole slightly south of the mapped area, near
20°N, 20°W. The dashed line is a small circle
about this pole. (Argus et al., 1989. J.
Geophys. Res., 94, 5585-5602, copyright
by the American Geophysical Union.)



America-South America boundary is only about 1 mm/yr — 
much slower than the approximately 20 mm/yr along the Mid- 
Atlantic ridge. The North America-South America boundary 
is thus considered a diffuse, slow-moving boundary zone, 
although its location and motion are not well constrained. 
Another reason for treating this as a boundary zone is that 
paleomagnetic reconstructions find that over the past 70 Myr 
the two plates have moved relative to each other as the Atlantic 
Ocean opened. 
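
Closure of the plate circuit amounts to subtracting Euler vectors, after which the linear velocity at a point is the cross product of the Euler vector with the position vector. The sketch below illustrates this with two deliberately similar, entirely hypothetical Euler vectors; they are not NUVEL-1A values, and the function names and spherical-earth radius are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed spherical earth


def euler_vector(pole_lat_deg, pole_lon_deg, rate_deg_per_myr):
    """Cartesian Euler vector (rad/Myr) from pole position and rotation rate."""
    lat, lon = math.radians(pole_lat_deg), math.radians(pole_lon_deg)
    w = math.radians(rate_deg_per_myr)
    return (w * math.cos(lat) * math.cos(lon),
            w * math.cos(lat) * math.sin(lon),
            w * math.sin(lat))


def subtract(a, b):
    """Component-wise difference of two Cartesian vectors."""
    return tuple(ai - bi for ai, bi in zip(a, b))


def surface_speed_mm_yr(omega, lat_deg, lon_deg):
    """Speed |omega x r| at a surface point; km/Myr is numerically mm/yr."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    r = (EARTH_RADIUS_KM * math.cos(lat) * math.cos(lon),
         EARTH_RADIUS_KM * math.cos(lat) * math.sin(lon),
         EARTH_RADIUS_KM * math.sin(lat))
    v = (omega[1] * r[2] - omega[2] * r[1],
         omega[2] * r[0] - omega[0] * r[2],
         omega[0] * r[1] - omega[1] * r[0])
    return math.sqrt(sum(c * c for c in v))


# Hypothetical, nearly equal Euler vectors for the two plates relative to Africa.
na_af = euler_vector(60.0, -40.0, 0.30)
sa_af = euler_vector(62.0, -38.0, 0.31)

# Circuit closure: North America relative to South America.
na_sa = subtract(na_af, sa_af)
boundary_speed = surface_speed_mm_yr(na_sa, 15.0, -55.0)
ridge_speed = surface_speed_mm_yr(na_af, 15.0, -55.0)
```

Because the two input vectors are nearly equal, the difference is small and sensitive to errors in either one, which is why motion estimated this way is poorly constrained.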

In general, 1-2 mm/yr is an approximate lower limit for 
plate boundary deformation. Regions with motions faster 
than this are generally viewed as plate boundaries, and slower 
deformation is generally treated as intraplate. However, there
is no generally accepted criterion, and evidence from seismicity 
and topography is also considered. Put another way, in many 
cases one can regard a region as either a slow-moving plate 
boundary zone or a zone of intraplate deformation, and 
“intraplate” earthquakes are often just ones not on an obvious 
plate boundary. 

The Atlantic example (Fig. 5.5-1) shows that in addition
to the North America-South America boundary zone, some 
intraplate seismicity is concentrated in other areas associated 
with tectonic features. For example, seismicity between Greenland
and North America is likely related to the former spreading
ridge that opened this part of the Atlantic (the Labrador
Sea). Although this spreading stopped about 43 Myr ago, the 
fossil ridge appears to remain a weak zone along which 
intraplate stresses cause some motion. Intraplate seismicity is 
often associated with such fossil structures. Concentrations of 
seismicity are also associated with the Bermuda (32°N, 65°W), 
Cape Verde (17°N, 25°W), and Canary (26°N, 17°W) hot 
spots. Focal mechanism studies are consistent with the earthquakes
reflecting heating of the lithosphere by the hot spots.

Hawaii, the most impressive hot spot trace in the oceans 
(Fig. 5.2-7), 1 provides the best example of intraplate earthquakes
associated with hot spot processes (Fig. 5.5-3). Small
earthquakes are associated with magma upwelling in the rift 
zones. Larger earthquakes, which occur on a time scale of tens 
of years, reflect sliding of the volcanic edifice on subhorizontal 
faults that are thought to be a layer of weak sediments at the 
top of the old oceanic crust on which the volcanic island 
formed. These earthquakes can be quite large — the 1975 

1 Numerical models that infer the amount of upwelling mantle material from how 
elevated the sea floor is relative to the normal depth-age curves estimate that Hawaii 
has a buoyancy flux 5-10 times greater than that of Bermuda (Sleep, 1990). 

Fig. 5.5-3 Schematic model for large intraplate earthquakes below the
island of Hawaii. Small earthquakes are associated with magma upwelling 
in the rift zones. Larger earthquakes, at dates shown, reflect sliding of the 
volcanic edifice on subhorizontal faults. The portion of the basal fault 
that has not ruptured in historic time may be a seismic gap. (Wyss and 
Koyanagi, 1992. © Seismological Society of America. All rights reserved.) 


Kalapana earthquake had M_s 7.2, caused a tsunami that
killed two campers on the seashore, and did considerable 
property damage. The earthquake was followed by a small 
volcanic eruption near the summit of Kilauea, perhaps because 
the ground shaking triggered an eruption of shallow magma. 
Curiously, some earthquakes occur to considerable depths 
under Hawaii, including a magnitude 6.2 earthquake at 48 km 
depth. 

Although many oceanic intraplate earthquakes are associated 
with tectonic features, some appear to occur far from plate 
boundaries, hot spots, or major bathymetric features. Thus the 
stresses generated by plate driving forces and other sources, 
including mantle flow near hot spots, appear to reactivate weak 
zones in the plate resulting from small-scale structure acquired 
during the lithosphere’s evolution. 2 

These earthquakes can be dramatic. For example, the 
enormous (M_w 8.2) intraplate earthquake that occurred near
the Balleny Islands in an oceanic part of the Antarctic plate 
(63°S, 149°E) in March 1998 was the largest earthquake that 
had occurred on earth for several years. The fault inferred from 
waveform modeling (Section 4.3) followed no observable lineaments
and cut straight across existing fracture zones. Moreover,
in the previous hundred years, no other earthquakes
had been located in this region. It is not clear what caused the 
earthquake or whether this area has any special properties or 
stress acting there. Although the earthquake occurred south of 
a puzzling hypothesized deformation zone in the extreme 
southeast corner of the Australian plate (Fig. 5.2-4), its fault 
plane solution is inconsistent with its being on the boundaries 
of a microplate. It is thus unclear whether this area is now any 

2 This situation is analogous to timbers creaking as a wooden boat rocks in the 
waves. 



more prone to future earthquakes than other areas, and 
what the recurrence time of such earthquakes might be. Similar
issues arise in considering the intraplate seismicity and
associated seismic hazard in the more structurally complex 
continents. 

Oceanic intraplate seismicity often occurs in swarms. 
Regions without previously known seismicity sometimes become
active for several years, with hundreds of teleseismically
located earthquakes. 3 The seismicity then dies out, and seems
not to recur. For example, during 1981-3, an intraplate earthquake
swarm occurred near the Gilbert Islands in Micronesia.
A total of 225 earthquakes were detected, mostly over a
15-month period, with 87 above m_b 5. No major tectonic features
are known in this area, and a ship survey found no bathymetric 
anomalies. Before and after the swarm, no other earthquakes 
have been recorded in this region. The swarms thus differ from 
plate boundary seismicity, which occurs on features that remain
active for long periods even if there are intervening quiet
intervals. Moreover, the intraplate swarms often appear not
to have a single well-developed fault, and no event is significantly
bigger than the others. By contrast, plate boundary
earthquakes usually have one or two main ruptures and many 
aftershocks, perhaps reflecting local adjustments to the stress 
field after the mainshock has ruptured the entire fault. 

These swarms raise an interesting issue. We can assume that 
these areas are analogous to plate boundaries in having special, 
if not yet understood, tectonic significance. If so, they are likely 
to be the sites of future swarms. Alternatively, perhaps all areas 
of oceanic lithosphere are equally susceptible to such swarms. 
In this case, over time, swarms will occur in many places, and 
future swarms are no more likely in one place than another. We 
will see that similar issues surface in trying to estimate seismic 
hazards due to intraplate earthquakes within continents. 

5.5.2 Forces and stresses in the oceanic lithosphere

In addition to using oceanic intraplate seismicity to investigate 
the specific processes acting at individual sites, we study the 
seismicity to learn about plate-wide processes. For example, 
Fig. 5.5-4 shows the variation of mechanism type with 
lithospheric age. Most of the oceanic lithosphere seems to be in 
horizontal deviatoric compression, as shown by thrust and 
strike-slip mechanisms. This compression is in approximately 
the spreading direction, and is thought to be related to “ridge 
push”: the plate driving force due to lithospheric cooling and 
subsidence. The major exceptions are the extensional events 
occurring in the central Indian Ocean. Although originally regarded
as intraplate, these earthquakes now appear to be in
a diffuse plate boundary zone (Section 5.2.2). In the model
shown, the focal mechanisms (Fig. 5.5-5) reflect counterclockwise
rotation of Australia with respect to India, causing normal
fault earthquakes in the young lithosphere near the Euler pole 

3 There may be many more smaller earthquakes associated with these swarms, but 
because the swarms often occur in remote regions, only the larger events are detected. 

Fig. 5.5-4 Focal mechanism type as a function of lithospheric age
for oceanic intraplate earthquakes. Older oceanic lithosphere is in
compression, whereas younger lithosphere has both extensional and
compressional mechanisms. Extensional events are located primarily in
the central Indian Ocean. (Wiens and Stein, 1984. J. Geophys. Res., 89,
11,442-64, copyright by the American Geophysical Union.)


and thrust and strike-slip earthquakes to the east. These earthquakes
reach magnitude 7 on the Ninetyeast ridge. 4

The general trend of compressive mechanisms in the oceanic 
plates is consistent with the plate driving force due to the cooling
of the oceanic lithosphere. Consider a plate, defined as the
area above the $m(t)$ isotherm, out to age $t$, where the water
depth is $h(t)$ (Fig. 5.5-6). The plate is cooler, and thus denser,
than material below. The thermal model we used for ocean
depth and heat flow also predicts the resulting force.

The total horizontal force on the base of the lithosphere, $F_1$,
equals the integrated horizontal pressure force of the asthenosphere
at the ridge, because the material is in hydrostatic
equilibrium:

$$F_1 = \int_0^{m(t)} \rho_m g z \, dz = \rho_m g (m(t))^2 / 2. \qquad (1)$$


Similarly, $F_2$, the horizontal force due to water pressure on the
plate, equals the integrated horizontal pressure force of the 
water, 

4 Although hot spot tracks like the Ninetyeast and Chagos-Laccadive ridges have 
been termed “aseismic” ridges, to distinguish them from spreading ridges, these two 
are more seismically active in terms of moment release than many spreading ridges. 



Fig. 5.5-5 Schematic map of earthquake mechanisms in the central
Indian Ocean, shown here as a diffuse boundary zone (shaded) between
the Indian and Australian plates. Later studies have refined the location
and geometry of the boundary zone (Fig. 5.2-4) and pole (triangle). (Wiens
et al., 1985. Geophys. Res. Lett., 12, 429-32, copyright by the American
Geophysical Union.)

Fig. 5.5-6 Derivation of the “ridge push” force.


$$F_2 = \int_0^{h(t)} \rho_w g z \, dz = \rho_w g (h(t))^2 / 2. \qquad (2)$$


$F_3$ is the remaining horizontal force due to lithospheric pressure
$P(z, t)$,

$$F_3 = \int_{h(t)}^{m(t)} P(z, t) \, dz, \qquad (3)$$

where the pressure depends on the density perturbation due to
lithospheric cooling (Eqn 5.3.7),

$$P(z, t) = \rho_w g h(t) + g \int_{h(t)}^{z} [\rho_m + \rho'(z', t)] \, dz'. \qquad (4)$$

Fig. 5.5-7 Geometry for a simple model of intraplate stresses.


If the plate is not accelerating, the force difference is
balanced by a net horizontal force

$$F_R = F_1 - F_2 - F_3. \qquad (5)$$

For the cooling halfspace temperature structure (Eqn 5.3.2),
this force is

$$F_R = g \alpha \rho_m T_m \kappa t, \qquad (6)$$

whereas for a plate model it approaches a constant value
for old lithosphere. The convention of calling this force “ridge
push” is confusing because it is zero at the ridge and increases
linearly with plate age. It results not from force at the ridge but
from the total force due to the density anomaly within the
cooling plate out to any given age.

The expression for the “ridge push” force is similar to that
for the “slab pull” force (Eqn 5.4.15) because both are thermal
buoyancy forces due to the density contrast resulting from the
temperature difference between the plate and its surroundings.
The two depend in the same way on the $g \alpha \rho_m T_m$ term that
describes the force due to the density contrast, but differently
on $\kappa$ because faster cooling increases ridge push whereas faster
heating decreases slab pull. Although it is useful to think of
the forces separately, both are net buoyancy forces due to the
mantle convection system of which the plates are a part. 5

5 Verhoogen (1980) offers the analogy that rain occurs because of the negative
buoyancy of the drops relative to the surrounding air, as part of the process by which
solar heat evaporates water which rises as vapor due to positive buoyancy and is
transported by wind to the point where it cools, condenses into drops, and then falls.

To discuss the stresses within the oceanic lithosphere, we
compare the ridge push force to the other forces applied at the
boundaries of the plate. These include forces at the plate base
and forces at the subduction zone. As for the downgoing slab,
earthquake focal mechanisms constrain the relative size of the
forces. Here, we use the observation (Fig. 5.5-4) that stress in
the spreading direction is typically compressive at all ages.

Consider a simple model of stress in the oceanic lithosphere,
using the geometry of Fig. 5.5-7. Using the stress equilibrium
equation (Eqn 2.3.49) in the spreading ($x$) direction, we relate
the deviatoric stresses to the body force $f(x, z)$, which is the
contribution to ridge push from the material at $(x, z)$,

$$\frac{\partial \sigma_{xx}(x, z)}{\partial x} + \frac{\partial \sigma_{xz}(x, z)}{\partial z} + f(x, z) = 0. \qquad (7)$$

Integrating first with respect to $x$ and then with respect to $z$
from $z = 0$ to the base of the lithosphere $m(x)$ yields the force
balance

$$\bar{\sigma}_{xx}(x) = \frac{\sigma_b x - F_R(x)}{m(x)} + \sigma_r. \qquad (8)$$

Here the stress in the spreading direction is given by its vertical
average $\bar{\sigma}_{xx}(x)$; $\sigma_r = \sigma_{xx}(0)$ characterizes the strength of the
ridge; the drag force at the base of the plate is given by the basal
shear stress $\sigma_b$; and $F_R(x)$ is the net ridge push force

$$F_R(x) = \int_0^{m(t)} \int_0^{x} f(x, z) \, dx \, dz. \qquad (9)$$

Written in terms of plate age, $t$,

$$\bar{\sigma}_{xx}(t) = \frac{\sigma_b v t - F_R(t)}{m(t)} + \sigma_r, \qquad (10)$$


where $v$ is a half spreading rate, assumed constant. A useful
form for comparing different plates comes from the usual
assumption that the basal drag force equals the product of
absolute velocity $u$ and drag coefficient $C$ ($\sigma_b = Cu$),

$$\bar{\sigma}_{xx}(t) = \frac{C u v t - F_R(t)}{m(t)} + \sigma_r. \qquad (11)$$


Thus a drag depending on absolute velocity is applied over 
an area proportional to the spreading rate. For simplicity, we 
assume that v = u, spreading rate equals absolute velocity (the
ridge is fixed with respect to the mantle), so the net drag force is 
proportional to velocity squared. 

A subduction zone would provide a boundary condition on 
the oldest lithosphere. For example, if focal mechanisms in the
lithosphere near trenches were extensional, an extensional condition
could be imposed. Because such mechanisms are not
seen, it is often assumed that the negative buoyancy of slabs 
(slab pull) is balanced by local resistive forces (Section 5.4.2). 
Thus, although the ridge push force is probably smaller than 
the slab pull forces, the thrust fault mechanisms suggest that it 
is more crucial for determining stress in oceanic lithosphere. 

Although this stress model is schematic and does not 
describe any individual plate, it lets us use focal mechanism 
observations to estimate several important quantities. Figure
5.5-8 shows the predicted intraplate stress as a function
of plate age and drag coefficient. For zero drag the stress is
purely compressive ($\sigma_{xx} < 0$) and varies as $\sqrt{t}$, because the
force increases linearly with age, whereas the plate thickens as
its square root. For larger drag coefficients, $\sigma_{xx}$ follows $\sqrt{t}$
curves corresponding to less and less compression, until the
lithosphere is in extension for all ages. All lithospheric plates

Fig. 5.5-8 Intraplate stress in the spreading direction as a function of
lithospheric age and assumed basal drag coefficient for slow-moving 
(1 cm/yr, top) and fast-moving (10 cm/yr, bottom) plates. The 
compressional stresses in old oceanic lithosphere place an upper bound 
on the drag coefficient of 4 MPa/(m/yr). (Wiens and Stein, 1985. 
Tectonophysics, 116, 143-62, with permission from Elsevier Science.) 


appear to be in compression, so a rapidly moving plate (such as 
the Pacific, which moves at about 10 cm/yr) constrains the drag 
coefficient to less than about 4 MPa/(m/yr). Similar results
emerge for a cooling plate model. 
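
The halfspace version of this stress model is simple enough to evaluate directly. The sketch below is an illustration under stated assumptions rather than the calculation behind Fig. 5.5-8: it uses the ridge push force of Eqn 6, the age form of the stress balance with drag coefficient C and v = u, and zero ridge strength by default. The material constants and the thickness law m(t) = 2.32 sqrt(kappa t) are assumed values chosen for illustration.

```python
import math

SECONDS_PER_YEAR = 3.15576e7

# Assumed illustrative constants (not taken from the text).
G = 9.8                  # gravitational acceleration, m/s^2
ALPHA = 3.0e-5           # thermal expansion coefficient, 1/K
RHO_M = 3300.0           # mantle density, kg/m^3
T_M = 1300.0             # mantle temperature, K
KAPPA = 1.0e-6           # thermal diffusivity, m^2/s
THICKNESS_FACTOR = 2.32  # assumed m(t) = 2.32 * sqrt(kappa * t)


def ridge_push(age_myr):
    """Ridge push force per unit ridge length (N/m) for a cooling halfspace."""
    t = age_myr * 1.0e6 * SECONDS_PER_YEAR
    return G * ALPHA * RHO_M * T_M * KAPPA * t


def plate_thickness(age_myr):
    """Assumed lithospheric thickness (m) at a given age."""
    t = age_myr * 1.0e6 * SECONDS_PER_YEAR
    return THICKNESS_FACTOR * math.sqrt(KAPPA * t)


def mean_stress_mpa(age_myr, drag_mpa_per_m_yr, speed_m_yr, ridge_strength_mpa=0.0):
    """Vertically averaged stress (MPa); negative values are compressive."""
    age_yr = age_myr * 1.0e6
    sigma_b = drag_mpa_per_m_yr * 1.0e6 * speed_m_yr   # basal shear stress, Pa
    drag_force = sigma_b * speed_m_yr * age_yr          # acts over x = v t, N/m
    stress_pa = (drag_force - ridge_push(age_myr)) / plate_thickness(age_myr)
    return stress_pa / 1.0e6 + ridge_strength_mpa


def critical_drag(speed_m_yr):
    """Drag coefficient, MPa/(m/yr), at which drag balances ridge push."""
    push_per_year = G * ALPHA * RHO_M * T_M * KAPPA * SECONDS_PER_YEAR
    return push_per_year / (speed_m_yr ** 2) / 1.0e6


fast_plate_limit = critical_drag(0.10)               # 10 cm/yr plate
no_drag_stress = mean_stress_mpa(100.0, 0.0, 0.10)   # 100 Myr old lithosphere
```

With these assumed constants the drag coefficient at which a 10 cm/yr plate passes from compression to extension is close to the 4 MPa/(m/yr) bound quoted above, although that agreement depends on the parameter values chosen. Because drag and ridge push both grow linearly with age when the ridge has no strength, the sign of the stress is then the same at all ages.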

This model assumes a zero stress boundary condition at the 
ridge axis, so the axis has no tensile strength. The predicted 
stress in young lithosphere, especially the location of a possible 
transition from compression to extension in the direction of 
spreading, would be sensitive to the strength of the ridge 
(Fig. 5.5-9). Models with substantial strength at the axis predict
a wide band of extension in the spreading direction. Since
such a zone of normal-faulting earthquakes is not observed, 
the axis seems weak. 


Although this simple model describes only a hypothetical 
average plate, more sophisticated models use realistic plate 
geometries to calculate the stresses expected from ridge push, 
slab pull, and basal drag forces. These models’ predictions can 
be compared to earthquake focal mechanisms and other data 
for specific areas. For example, Fig. 5.5-10 shows stresses 
predicted for the Indian Ocean region. Although the model 
was calculated assuming a single Indo-Australian plate, it predicts
stresses in the region now considered a diffuse boundary



Fig. 5.5-9 Intraplate stress in the spreading direction as a function of
lithospheric age computed for several values of ridge strength. The age of
the transition from ridge-normal extension to compression increases with
the strength of the ridge. (Wiens and Stein, 1984. J. Geophys. Res., 89,
11,442-64, copyright by the American Geophysical Union.)


zone (Fig. 5.5-5) that are generally consistent with the focal 
mechanisms and the folding seen in gravity and seismic reflection
data.

5.5.3 Constraints on mantle viscosity 

The last section’s analysis relating earthquake mechanisms 
to drag at the base of the lithosphere also gives insight into the 
viscosity of the mantle. The viscosity, 6 the proportionality 
constant between shear stress and the strain rate (or velocity 
gradient), controls how the mantle flows in response to applied 
stress, and is thus crucial for mantle convection. If the drag on 
the base of a plate is due to motion over the viscous mantle, 
compressive earthquake mechanisms in old lithosphere constrain
the viscosity.

Consider a simple two-dimensional geometry where mass 
flux due to the moving plate is balanced by a return flow at 
depth (Fig. 5.5-11, top). The drag coefficient is proportional to
the viscosity and inversely proportional to the flow depth. Figure
5.5-12 shows that the basal drag constraint from the focal
mechanism data, C < 4 MPa/(m/yr), requires an average mantle
viscosity less than 2 × 10^20 poise if flow occurs to a depth of
700 km in the upper mantle, or 10^21 poise if flow occurs in the
entire mantle. These values are lower than the 1-5 × 10^22 poise
typically estimated from glacial rebound, earth rotation, and 
satellite orbits. 
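
These bounds can be reproduced with a short calculation. The sketch below assumes the standard result for a uniform-viscosity channel whose top moves with the plate, whose base is fixed, and whose net mass flux is zero, for which the basal shear stress is 4 eta u / d, so that C = 4 eta / d. That specific relation and the 2900 km whole-mantle depth are assumptions here; the text states only the proportionalities.

```python
SECONDS_PER_YEAR = 3.15576e7
POISE_PER_PA_S = 10.0  # 1 poise = 0.1 Pa-s


def viscosity_bound_poise(drag_mpa_per_m_yr, flow_depth_km):
    """Upper bound on viscosity (poise) from a drag coefficient bound.

    Assumes single-layer return flow with C = 4 * eta / d.
    """
    drag_si = drag_mpa_per_m_yr * 1.0e6 * SECONDS_PER_YEAR  # Pa-s/m
    depth_m = flow_depth_km * 1.0e3
    eta_pa_s = drag_si * depth_m / 4.0
    return eta_pa_s * POISE_PER_PA_S


upper_mantle_bound = viscosity_bound_poise(4.0, 700.0)    # flow to 700 km
whole_mantle_bound = viscosity_bound_poise(4.0, 2900.0)   # assumed mantle depth
```

Under these assumptions the two bounds come out near 2 × 10^20 and 10^21 poise, matching the values quoted from Fig. 5.5-12.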

This discrepancy can be reconciled by assuming that the plate 
is underlain by a thin, low-viscosity asthenosphere (Fig. 5.5-11, 
bottom). The low-viscosity layer, in which only a fraction of the 
return flow occurs, decouples the plates from the underlying 

6 Viscosity, defined in Section 5.7, is given in cgs units as poise (dyn-s/cm^2) or in SI
units as Pascal-seconds (1 poise = 0.1 Pa-s).







Fig. 5.5-10 Intraplate stress predicted by a
force model for the Indo-Australian plate.
The bars show the principal horizontal
deviatoric stresses, with arrowheads
marking tension. The location and
orientation of the highest stresses, such as
the transition between compression and
tension, are generally consistent with
earthquake mechanisms in the region now
regarded as a diffuse plate boundary
(Fig. 5.5-5). (Cloetingh and Wortel, 1985.
Geophys. Res. Lett., 12, 77-80, copyright
by the American Geophysical Union.)



Fig. 5.5-11 Top : Velocity profile associated with a return flow of uniform- 
viscosity asthenosphere that balances the mass flux due to plate motions. 
Bottom: Velocity profile associated with a return flow of two layers of 
different viscosity. The upper, low-viscosity layer decouples the plates 
from the underlying mantle. (McKenzie and Richter, 1978.) 

Fig. 5.5-12 Basal drag coefficients as a function of the mantle viscosity
and flow depth, assuming single-layer flow. (Wiens and Stein, 1985.
Tectonophysics, 116, 143-62, with permission from Elsevier Science.)

Fig. 5.6-1 Schematic illustration of the
Wilson cycle, the fundamental geological
process controlling the evolution of the
continents. (a)-(b): A continent rifts, such
that the crust stretches, faults, and subsides.
(c): Sea floor spreading begins, forming a new
ocean basin. (d): The ocean widens and is
flanked by sedimented passive margins. (e):
Subduction of oceanic lithosphere begins on
one of the passive margins, closing the ocean
basin (f) and starting continental mountain
building. (g): The ocean basin is destroyed by
a continental collision, which completes the
mountain building process. At some later
time, continental rifting begins again.



mantle. Viscosity values that satisfy the focal mechanisms are 
consistent with constraints from gravity and glacial isostasy, 
and such decoupling is consistent with the lack of correlation 
between oceanic plate area and absolute velocity (Fig. 5.4-12). 

5.6 Continental earthquakes and tectonics 

Although the basic relationships between plate boundaries, 
plate interiors, and earthquakes apply to continental as well as 
oceanic lithosphere, the continents are more complicated. The 
continental crust is much thicker, less dense, and has different 
mechanical properties from the oceanic crust. As a result, plate 
boundaries in continental lithosphere are generally broader and 
more complicated than in the oceanic lithosphere (Fig. 5.2-4). 

Studies of continental plate boundaries, which rely heavily 
on seismology, provide important insights into the funda¬ 


mental geological processes controlling the evolution of the 
continents. The basic process, known as the Wilson cycle, 1 is 
illustrated in Fig. 5.6-1. A continental region undergoes 
extension, such that the crust is stretched, faulted, and sub¬ 
sides, yielding a rift valley like the present East African rift. 
Because the uppermost mantle participates in the stretching, 
hotter mantle material upwells, causing partial melting and 
basaltic volcanism. Sometimes the extension stops after only a 
few tens of kilometers, leaving a failed or fossil rift such as the 
1.2 billion-year-old mid-continent rift in the central USA. In 
other cases the extension continues, so the continental rift 
evolves into an oceanic spreading center (identifiable from sea 
floor magnetic anomalies), which forms a new ocean basin like 

1 Named after J. Tuzo Wilson (1908-93), whose key role in developing plate tec¬ 
tonic theory included introducing the ideas of transform faults, hot spots, and that the 
Atlantic had closed and then reopened.

Fig. 5.6-2 Seismicity and focal mechanisms
(T axes shown by black arrows) for the East
African rift system, with relative plate
motions (white arrows) from Chu and
Gordon (1998, 1999).



Fig. 5.6-3 Variation in motion of space-geodetic 
sites across part of the Pacific-North America 
boundary zone. Right: Horizontal velocities of
sites in California, Nevada, and Arizona relative 
to stable North America. The velocity of the 
southwesternmost site nearly equals the predicted 
48 mm/yr velocity of the Pacific plate relative to 
the North American plate. Left: Component of 
motion tangent to small circles centered on the 
Pacific-North America Euler pole versus angular 
distance from that pole. Velocities increase with 
distance from the Euler pole, with a discontinuity 
due to the approximately 35 mm/yr of time- 
averaged slip across the San Andreas fault. 
(Gordon and Stein, 1992. Science, 256, 333-42, 
copyright 1992 American Association for the 
Advancement of Science.)

Fig. 5.6-4 Schematic illustration of the distribution of motion in space
and time for a strike-slip boundary zone between two major plates.
(Stein, 1993. Contributions of Space Geodesy to Geodynamics, 5-20,
copyright by the American Geophysical Union.)

on the main boundary fault, can be more damaging than larger 
but more distant ones on the main fault. Hence the Los Angeles 
area is vulnerable to both nearby earthquakes like the 1994
Northridge (M_w 6.7) or 1971 San Fernando (M_s 6.6) earthquakes
and larger ones on the more distant San Andreas Fault,
such as a recurrence of the 1857 Fort Tejon earthquake, which
is estimated to have had M_w about 8. Similarly, the earthquake
hazard in the Seattle area involves both great earthquakes at
the subduction interface and smaller, but closer, earthquakes
in the subducting Juan de Fuca plate (like the 2001 M_w 6.7
Nisqually earthquake) or at shallow depth in the North
American plate.

Of the three boundary types, continental convergence zones 
may be the most complicated compared to their oceanic coun¬ 
terparts. One primary difference is that because continental 
crust is much less dense than the upper mantle, it is not 
subducted, and a Wadati-Benioff zone is not formed. As a 
result, continental convergence zones in general do not have 
intermediate and deep focus earthquakes. However, the plate 
boundary tectonics occur over a broader and more complex 
region than in the oceanic case. 

A spectacular example is the collision between the Indian 
and Eurasian plates. This area is the present type example of 
mountain building by continental collision, which has pro¬ 
duced a boundary zone extending thousands of km north¬ 
ward from the nominal plate boundary at the Himalayan front 
(Fig. 5.6-5). The total plate convergence is taken up in several 
ways. About half of the convergence occurs across the locked 
Himalayan frontal faults such as the Main Central Thrust 
(Fig. 5.6-6), and gives rise to large destructive earthquakes. 
These faults are part of the interface associated with the under¬ 
thrusting Indian continental crust, which thickens the crust 
under the high Himalayas. However, the earthquakes also show 
normal faulting behind the convergent zone, in the Tibetan 
plateau, presumably because the uplifted and thickened 
crust spreads under its own weight. GPS data (Fig. 5.6-5) show 
that this extension is part of a large-scale process of crustal 
“escape,” or “extrusion,” in which large fragments of con¬ 
tinental crust are displaced eastward by the collision along 



Fig. 5.6-5 Summary of crustal motions 
determined using space geodesy in the 
India-Eurasia plate collision zone. Large 
arrows indicate velocities relative to 
Eurasia. Arrows in circles show velocities 
with no significant motion with respect 
to Eurasia. Small arrows show local 
relative deformation. (Larson et al., 1999.
J. Geophys. Res., 104, 1077-94, copyright
by the American Geophysical Union.)

Fig. 5.6-6 Focal mechanisms and tectonic
interpretation for the Himalayan continental
convergence zone. MCT and MBT are the
Main Central and Main Boundary thrust
faults. (After Ni and Barazangi, 1984.
J. Geophys. Res., 89, 1147-64, copyright
by the American Geophysical Union.)

Fig. 5.6-7 Demonstration of the deformation of Asia, modeled by a striped block of plasticine, as the result of a collision with a rigid block simulating the
Indian subcontinent. The plasticine is constrained on the left side, so the impact forces blocks to be extruded to the right, analogous to the eastward motion
of blocks in Indochina and China. (Tapponnier et al., 1982. Geology, 10, 611-16, with permission of the publisher, the Geological Society of America,
Boulder, Co. © 1982 Geological Society of America.)


major strike-slip faults. This extrusion has been modeled assum¬ 
ing that India acts as a rigid block indenting a semi-infinite 
plastic medium (Asia), giving rise to a complicated faulting and 
slip pattern (Fig. 5.6-7). The extent of the collision is illustrated 
by GPS data and focal mechanisms showing that the Tien Shan 


intracontinental mountain belt, 1000-2000 km north of the 
Himalayas, accommodates almost half the net plate conver¬ 
gence in the western part of the zone. 

In addition to providing data about a collision region’s 
kinematics, seismological studies provide insight into its mech- 
































Fig. 5.6-8 GPS observations of motions relative to Eurasia (a), focal mechanisms (b), and tectonic interpretation (c) for a portion of the Africa-Arabia- 
Eurasia plate collision zone. Note strike-slip along the North Anatolian fault, extension in western Anatolia and the Aegean region, and compression in the 
Caucasus mountains. Rates are in mm/yr. (McClusky et al., 2000. J. Geophys. Res., 105, 5695-5719, copyright by the American Geophysical Union.)


anics. The collision process is thought to involve a complex 
interplay between forces due directly to the collision, gravita¬ 
tional forces due to the resulting uplift and crustal thickening, 
and forces from the resulting mantle flow. Earthquake depths 
and studies of seismic velocity, attenuation, and anisotropy are 
providing data on crustal thicknesses, thermal and mechanical 
structures, and mantle flow. For example, P-wave travel time
tomography shows high velocity under the presumably cold 
Himalayas, which contrasts with low velocity under Tibet. 
These and other seismological data are consistent with the idea 
that Tibet deforms easily during the collision. 

An equally complicated situation occurs in the eastern Mediterranean
collision zone involving the African, Arabian, and
Eurasian plates. Combining GPS and focal mechanism data
shows the complex motions. Figure 5.6-8 (a) shows the motions
of sites in the eastern Mediterranean relative to Eurasia.
Northern portions of Arabia move approximately N40°W, 
consistent with global plate motion models. Western Turkey 
rotates as the Anatolian plate about a pole near the Sinai pen¬ 
insula. Anatolia is thus “squeezed” westward between Eurasia 
and northward-moving Arabia (Fig. 5.6-8, c). 2 The motion 
across the North Anatolian fault, about 25 mm/yr, gives rise to 

2 Consider a melon seed squeezed between a thumb and a forefinger. 


large right-lateral strike-slip earthquakes (Fig. 5.6-8, b) such 
as the 1999 M_s 7.4 Izmit earthquake, which occurred about
100 km east of Istanbul and caused more than 30,000 deaths. 
To the west, the data show interesting deviations from a rigid 
Anatolian plate. The increasing velocities toward the Hellenic 
trench, where the African plate subducts below Crete and
Greece, show that western Anatolia and the Aegean region are 
under extension, consistent with the normal fault mechanisms. 
This region may be being “pulled” toward the arc, perhaps by 
an extensional process similar to oceanic back-arc spreading, 
as the trench “rolls back” (Section 5.2.4). By contrast, eastern 
Turkey is being driven northward into Eurasia, causing
compression that appears as the thrust fault earthquakes in 
the Caucasus mountains. The Dead Sea transform separates 
Arabia from the region to the west, sometimes viewed as 
the Sinai microplate. Strike-slip motion along this fault gives 
rise to the earthquakes mentioned in the Bible that repeatedly 
destroyed famous cities like Jericho. 

5.6.2 Seismic, aseismic, transient, and permanent deformation

The examples in the previous section illustrate that earthquakes
give powerful insights into the crustal deformation shaping the
continents. Other approaches to studying this deformation,
including various geodetic and geological means, sample the
deformation in different ways on various time scales (Fig. 5.6-9).
Hence, considerable attention goes into understanding how
the observations from these different techniques are related. For
example, as discussed earlier (Sections 4.5.4, 5.4.3), in many
places only part of the plate motion seems to occur as earthquakes,
and the rest takes place as aseismic slip. A related question
is how the deformation shown by earthquakes, which has
a time scale of a few years, is related to the longer-term deformation
that is recorded by topography and the geologic record.

Fig. 5.6-9 Schematic illustration of how crustal deformation on various
time scales is observed by different techniques.

To explore these ideas, consider the distribution of motion 
within the boundary zone extending from the stable interior 
of the oceanic Nazca plate, across the Peru-Chile trench to 
the coastal forearc, across the high Altiplano and foreland 
thrust belt, and into the stable interior of the South American 
continent. Figure 5.6-10 shows GPS site velocities relative to 
stable South America, which would be zero if the South Amer¬ 
ican plate were rigid and all motion occurred at the trench plate 
boundary. However, the site velocities are highest near the 
coast and decrease relatively smoothly from the interior of the 
Nazca plate to the interior of South America. 

Figure 5.6-10 (bottom) shows an interpretation of these 
data. In this model, about half of the plate convergence 
(approximately 35 mm/yr) is locked at the subduction inter¬ 
face, causing elastic strain of the overriding plate that will be 
released in large interplate thrust earthquakes (Section 4.5.4) 
like those whose focal mechanisms are shown. Thus the locked 
fraction of the plate motion corresponds to the seismic slip rate, 
perhaps via a process in which only a fraction of the interface 
is locked at any time. Approximately 20 mm/yr of the plate 
motion occurs by stable sliding at the trench, which does not 




Fig. 5.6-10 Top: GPS site velocities relative to stable South America 
(Norabuena et al., 1998. Science, 279, 358-62, copyright 1998 American
Association for the Advancement of Science), and selected earthquake 
mechanisms in the boundary zone. Rate scale is given by the NUVEL-1A 
vector. Bottom: Cross-section showing approximate velocity distribution 
inferred from GPS data. (Stein and Klosko, 2002. From The Encyclopedia 
of Physical Science and Technology, ed. R. A. Meyers, copyright 2002 by 
Academic Press, reproduced by permission of the publisher.) 

deform the overriding plate. This portion of the plate motion 
corresponds to aseismic slip. The rest occurs across the sub- 
Andean foreland fold-and-thrust belt, causing permanent 
shortening and mountain building, as shown by the inland 
thrust fault mechanisms. This portion of the plate motion 
would be considered aseismic slip if we considered only the 
fraction of the plate motion that appears in the trench seismic 
moment release, whereas in reality it occurs as inland deforma¬ 
tion. These interpretations come from analyzing the GPS data 
in the convergence direction relative to the stable interior of 
South America (Fig. 5.6-11). If all the convergence were locked 
on the interplate thrust fault, the predicted rates would exceed 
those observed within about 200 km of the trench. However, if 
only about half of the predicted convergence goes into locking 
the fault, the predicted rates near the trench are less, because 
only the portion of the slip locked at the interface deforms the 
overriding plate. Similarly, the data farther than about 300 km

Fig. 5.6-11 Derivation of the model
in Fig. 5.6-10 (bottom). Top: Model
geometry, assuming partial slip locked at 
the plate boundary and shortening in the 
eastern Andes. Center: GPS site velocities 
in the convergence direction and various 
models, given by the rates of locked 
slip and shortening. Solid line shows 
predictions of best-fitting model, 
including both partial slip locked at the 
plate boundary and shortening in the 
eastern Andes. Short dashed line shows 
predictions of model with all slip locked 
on the plate boundary and no shortening. 
Long-short dashed line shows predictions 
of model with no shortening and partial 
slip locked on the plate boundary equal to 
the sum of best-fitting slip and shortening. 
Bottom: Contour plot showing misfit to 
the data as a function of the slip rate locked 
on the plate boundary and shortening rate 
in the eastern Andes. The best fits (dots) 
occur for about 30-40 mm/yr of locking 
and about 10-20 mm/yr shortening. 
(Norabuena et al., 1998. Science, 279,
358-62, copyright 1998 American 
Association for the Advancement of 
Science.) 





from the trench are better fit by assuming that about 10 mm/yr 
motion is locked on thrust faults in the eastern Andes. The lock¬ 
ing and shortening rates are the best-fit parameters for this sim¬ 
ple model, which does not include other possible complexities 
such as deformation in the Altiplano. 

The idea that about 40% of the plate motion at the trench 
occurs by aseismic slip seems plausible, because studies using 
the history of large earthquakes at trenches often estimate that 
only about half the slip occurs seismically (Fig. 5.4-30). Given 
the problems of estimating source parameters of earthquakes 
from historical data, it is encouraging that the geodetic answer 
seems similar. 
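The partitioning described above can be checked with simple arithmetic. The sketch below uses only the rounded rates quoted in the text (35 mm/yr locked, 20 mm/yr stable sliding, about 10 mm/yr shortening); the variable names are illustrative, and the total is just the sum of these rounded values rather than an independent plate motion estimate.

```python
# Approximate partitioning of Nazca-South America convergence (mm/yr),
# using the rounded values quoted in the text.
locked_at_trench = 35.0    # locked on the interface, released in earthquakes
stable_sliding = 20.0      # aseismic slip at the trench
andean_shortening = 10.0   # taken up in the foreland thrust belt

trench_total = locked_at_trench + stable_sliding
total_convergence = trench_total + andean_shortening

# Fraction of the motion at the trench that occurs aseismically.
aseismic_fraction_at_trench = stable_sliding / trench_total
# Fraction of the summed convergence that is locked at the interface.
locked_fraction_of_total = locked_at_trench / total_convergence

print(f"aseismic fraction at trench: {aseismic_fraction_at_trench:.2f}")
print(f"locked fraction of total:    {locked_fraction_of_total:.2f}")
```

With these inputs the aseismic fraction at the trench comes out a little under 40%, and the locked fraction is roughly half of the total, in line with the statements in the text.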

The relation between the shortening rate in the thrust belt 
inferred from GPS data and that implied by the earthquakes 
can also be studied. Assessing the seismic slip rate is a little 
more complicated than for transform faults (Section 5.3.3) or 
subduction zone thrust faulting (Section 5.4.3), because in 
continental deformation zones earthquakes occur over a dis¬ 


tributed volume, rather than on a single fault, and have diverse 
focal mechanisms. Thus we sum the earthquakes' moment tensors
(Section 4.4) to estimate a seismic strain rate tensor 3 using

$$\dot{e}_{ij} = \sum M_{ij} / (2 \mu V t), \qquad (1)$$

where t is the time interval, and μ is the rigidity. V, the assumed
seismic source volume, is the product of the length and width of
the zone of seismicity and the depth to which seismicity extends.
For example, the thrust belt can be assumed to be approximately
2000 km long and 250 km wide, with faulting extending
to about 40 km depth. We can then diagonalize the result and
consider the eigenvalue associated with the P axis. Scaling this
value by the assumed zone width gives an estimate of the shortening
rate. The resulting value, less than 2 mm/yr, is significantly
less than the approximately 10 mm/yr indicated by the
3 Strain rates are often written using a dot to indicate the time derivative.

Fig. 5.6-12 Comparison of shortening
across the Andes with respect to stable
South America from GPS data (left)
and geological studies (right). The dashed
GPS vectors reflect elastic strain due to the
earthquake cycle at the trench, and are not
directly comparable to the permanent
shortening in the geological data. Motion
decreases toward the eastern extent of
the mountain range, shown by the solid line.
The geological vectors are largest at about
18°S and decrease to the north and south,
showing how the variation in shortening
that built the Andes bent them and made
them widest about this point. (Hindle
et al., 2002.)


GPS data. Thus, even given the usual problem that the seismic 
history is short and may have missed the largest earthquakes, 
an effect one can attempt to correct for using earthquake 
frequency-magnitude data (Section 4.7.1), it looks like much 
of the shortening occurs aseismically. 
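The moment tensor summation described above (summed moment tensors divided by 2μVt, then diagonalized) can be sketched numerically. In the example below the zone dimensions are those given in the text, but the rigidity, the catalog duration, and the three-event catalog are hypothetical values chosen only to illustrate the steps; the function names are likewise illustrative.

```python
import numpy as np

def seismic_strain_rate(moment_tensors, mu, volume, duration):
    """Sum the moment tensors and divide by 2 * mu * V * t."""
    total = np.sum(np.asarray(moment_tensors, dtype=float), axis=0)
    return total / (2.0 * mu * volume * duration)

def thrust_tensor(m0):
    """Moment tensor (N-m) for pure thrusting with a horizontal E-W P axis
    and a vertical T axis. Axes: x = east, y = north, z = up."""
    return m0 * np.array([[-1.0, 0.0, 0.0],
                          [0.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0]])

# Zone dimensions from the text, converted to meters.
length, width, depth = 2000e3, 250e3, 40e3
volume = length * width * depth

# Assumed values, for illustration only (not from the text).
mu = 3.0e10        # rigidity in Pa
duration = 20.0    # catalog length in years
catalog = [thrust_tensor(m0) for m0 in (1.0e19, 3.0e18, 5.0e17)]

strain_rate = seismic_strain_rate(catalog, mu, volume, duration)  # per year

# Diagonalize; eigvalsh returns eigenvalues in ascending order, so the
# first is the most compressive (P axis) principal strain rate.
eigenvalues = np.linalg.eigvalsh(strain_rate)
p_axis_rate = eigenvalues[0]

# Scale by the zone width to get a shortening rate, converted to mm/yr.
shortening_mm_per_yr = abs(p_axis_rate) * width * 1e3
print(f"seismic shortening rate: {shortening_mm_per_yr:.3f} mm/yr")
```

The seismic fraction of the shortening is then the ratio of this rate to the geodetic one; for the values quoted in the text, 2 mm/yr out of about 10 mm/yr is at most about 20%.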

An interesting question is how what we see today with earth¬ 
quakes and GPS data relates to what occurs over geologic time. 
Figure 5.6-12 shows the results of geological studies, in which 
the arrows indicate the deformation that occurred over the 
past 10 Myr as the Andes formed. The directions and rates are 
similar to those seen today, suggesting that the mountain
building process has occurred relatively uniformly, although 
there have been some rate changes. 4 

Putting all this together gives some ideas about how the dif¬ 
ferent measures of crustal deformation are related in this area. 
The first issue involves the relative amounts of seismic and 
aseismic deformation. It appears that about half of the plate 
motion at the trench occurs seismically. Similar fractions are 
also seen in other subduction zones (Fig. 5.4-30), implying that 
stable sliding at trenches is relatively common. Moreover, only 
about 10-20% of the shortening in the foreland thrust belt 
appears to occur seismically. Thus aseismic, and presumably 
permanent, deformation of rocks in the thrust belt seems like 
a major phenomenon. Similar results have also been observed 
for other continental deformation zones (Fig. 5.6-13). The next 
issue is that of permanent versus transient deformation. In the 
model of Fig. 5.6-11, the deformation of the South American 
plate due to the locked slip at the trench is transient, and will be 
released in the upcoming large trench earthquake. However,
it seems likely that the deformation of the foreland thrust belt 
is permanent, and goes into faulting and folding rocks. Over 

4 The similarity of the focal mechanism, GPS, and geological data illustrates the 
principle of uniformitarianism, that studying present processes gives insight into the
past, a tenet of geology since Lyell and Hutton’s seminal work almost two centuries ago. 


time, this permanent displacement adds up (Fig. 5.6-12) to 
build the mountains. 

Similar studies are going on around the world, and should 
lead to an improved understanding of the partitioning between 
seismic, aseismic, transient, and permanent deformation. Models 
are being developed to explore these issues (Section 5.7), which 
are important both for understanding continental evolution 
and for earthquake hazard assessment, because an apparent 
seismic moment deficit could indicate either overdue earth¬ 
quakes or aseismic deformation. 

5.6.3 Continental intraplate earthquakes

Another important application of earthquake studies deals 
with the internal deformation of the continental portions of the 
major plates. Although idealized plates would be purely rigid, 
intraplate earthquakes reflect the important and poorly under¬ 
stood tectonic processes of intraplate deformation. As in the 
oceans (Section 5.5.1), there appears to be a hierarchy of places 
that have such earthquakes. There are areas like the East 
African rift that can be thought of as either slow-moving plate 
boundaries or intraplate deformation, less active zones associ¬ 
ated with either fossil structures or other processes like hot 
spots, and then intraplate earthquakes that are not easily cor¬ 
related with any particular structure or cause. 

One example is the New Madrid area in the central USA, 
which had large earthquakes in 1811-12 and has small earth¬ 
quakes today. Other continental interiors, including Australia, 
western Europe, and India, have also had significant intraplate 
earthquakes. Because motion in these zones is at most a few 
mm/yr, compared to the generally much more rapid plate 
boundary motions, seismicity is much lower (Fig. 5.6-14) and 
thus harder to study. This difficulty is compounded by the fact 
that, unlike at plate boundaries, where plate motions give insight
into why and how often earthquakes occur, we have little

Fig. 5.6-13 Estimates of seismic deformation fractions for areas in the Mediterranean and Middle East. Seismicity appears to account for most or all of
the deformation in western Turkey, Iran, and the Aegean, much of the deformation in the Caucasus and eastern Turkey, and little of the deformation in
the Zagros and the Hellenic trench. (Jackson and McKenzie, 1988.)


idea of what causes intraplate earthquakes, and no direct way
to estimate how often they should occur. As a result, progress
in understanding these earthquakes is much slower than for
earthquakes on plate boundaries, and key issues may not be
resolved for a very long time.

Geodetic data illustrate the challenge. For example, comparison
of the absolute velocities of GPS sites in North America
east of the Rocky Mountains to velocities predicted by modeling
these sites as being on a single rigid plate shows that the
interior of the North American plate is rigid at least to the level
of the average velocity residual, less than 1 mm/yr (Fig. 5.6-15).
Similar results emerge from studies across the New Madrid
zone itself and for the interiors of other major plates, showing
that plates thought to have been rigid on geological time
scales are quite rigid on decadal scales. For example, 1 mm/yr
motion spread over 100 or 1000 km distance corresponds to
strain rates of 10^-8 and 10^-9 yr^-1 (3 × 10^-16 and 3 × 10^-17 s^-1),
respectively. Because the geodetic data include measurement
errors due to effects including instabilities of the geodetic
markers, it seems likely that the tectonic strains are even smaller.
However, over long enough time, even such small motions
can accumulate enough slip for large earthquakes to occur.

This idea is consistent with what is known about large
intraplate earthquakes. Although there is little seismological
data for such events because they are rare, insight can be
obtained from combining the seismological data with geodetic,
paleoseismological, and other geological and geophysical data.
For example, intensities estimated from historical accounts
of the 1811-12 New Madrid earthquakes (Fig. 1.2-4) suggest
magnitudes in the low 7 range. Paleoseismic studies (Section 1.2)
indicate that several previous large earthquakes, presumably
comparable to those of 1811-12, occurred 500-800 years
apart. Thus, in 500-1000 years (Fig. 5.6-16, top) steady strain
accumulation less than 2 mm/yr could provide up to 1-2 m of
motion available for future earthquakes, suggesting that they
would be about magnitude 7. A similar view comes from considering
the earthquake history for the area. As discussed in
Section 4.7.1, earthquakes of a given magnitude are approximately
ten times less frequent than those one magnitude unit
smaller. Thus, although the instrumental data contain no
earthquakes with magnitude greater than 5, both these and a
historical catalog in which magnitudes were estimated from
intensity data can be extrapolated to imply that a magnitude
7 earthquake would occur about once every 1400 ± 600 years
(Fig. 5.6-16, bottom). Hence, as expected, major intracontinental
earthquakes occur substantially less frequently than
comparable plate boundary events (Fig. 5.6-17). However,
because of the lower attenuation in continental interiors (Section
3.7.10), such earthquakes can cause greater shaking than
ones of the same magnitude on a plate boundary (Fig. 1.2-5).

Such earthquakes are generally thought to be due to the
reactivation of preexisting faults or weak zones in response
to either local or intraplate stresses. The New Madrid earthquakes,
for example, are thought to occur on faults associated

Fig. 5.6-14 Seismicity (magnitude 5 or greater since 1965) of the continental portion of the North American plate and adjacent area. Seismicity and
deformation are concentrated along the Pacific-North America plate boundary zone, reflecting the relative plate motion. The remaining eastern portion
of the continent, approximately that east of 260°, is much less seismically active. Within this relatively stable portion of the continent, seismicity, and thus
presumably deformation, are concentrated in several zones, most notably the New Madrid seismic zone. (Weber et al., 1998. Tectonics, 17, 250-66,
copyright by the American Geophysical Union.)


with a Paleozoic failed continental rift, now buried beneath 
thick sediments deposited by the Mississippi River and its
ancestors (Fig. 5.6-18). As a result, the faults are not exposed at 
the surface, so most ideas about them are based on inferences 
from seismology and other data. The intraplate stress field 
has been studied by combining focal mechanism and fault 
orientations with data from drill holes and in situ stress 
measurements (Fig. 5.6-19). In general, the eastern USA shows 
a maximum horizontal stress oriented NE-SW, consistent with 
the predictions of the stresses due to plate driving forces. 
Similar stress maps are being developed for other areas and are 
being used to investigate both intraplate deformation and plate 
driving forces. As noted in Section 3.6.5, it appears that seismic 
anisotropy in the lower continental crust may reflect the stress 
field that acted during a major tectonic event such as mountain 
building. 
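The order-of-magnitude arguments used earlier in this section for the New Madrid zone (strain rates from geodetic residuals, slip accumulated between large earthquakes, and the tenfold change in earthquake frequency per magnitude unit) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the frequency scaling assumes a Gutenberg-Richter relation with a b value of 1.

```python
SECONDS_PER_YEAR = 3.156e7

def strain_rate_per_year(velocity_mm_per_yr, distance_km):
    """Average strain rate for motion spread uniformly over a distance."""
    return (velocity_mm_per_yr * 1e-3) / (distance_km * 1e3)

def accumulated_slip_m(rate_mm_per_yr, years):
    """Slip available if motion accumulates steadily at the given rate."""
    return rate_mm_per_yr * 1e-3 * years

def recurrence_scaling(delta_magnitude, b_value=1.0):
    """Factor by which earthquakes become less frequent for a given
    magnitude increase, assuming a Gutenberg-Richter relation."""
    return 10.0 ** (b_value * delta_magnitude)

# 1 mm/yr spread over 100 km and 1000 km.
rate_100km = strain_rate_per_year(1.0, 100.0)      # about 1e-8 per year
rate_1000km = strain_rate_per_year(1.0, 1000.0)    # about 1e-9 per year
rate_100km_per_s = rate_100km / SECONDS_PER_YEAR   # about 3e-16 per second

# Upper bound of 2 mm/yr accumulating for 500 and 1000 years.
slip_low = accumulated_slip_m(2.0, 500.0)     # about 1 m
slip_high = accumulated_slip_m(2.0, 1000.0)   # about 2 m

# Magnitude 7 events relative to magnitude 5 events.
m5_to_m7 = recurrence_scaling(2.0)            # about 100 times less frequent

print(rate_100km, rate_1000km, rate_100km_per_s)
print(slip_low, slip_high, m5_to_m7)
```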


An intriguing question is why intraplate stresses cause earth¬ 
quakes on particular faults, given that many weak zones could 
serve this purpose. Geological and paleoseismic data, together 
with the absence of significant fault-related topography, sug¬ 
gest that individual intraplate seismic zones may be active for 
only a few thousands of years, so intraplate seismicity migrates. 
This possibility is akin to that suggested for intermittent 
oceanic intraplate earthquake swarms. If so, there is nothing 
special about New Madrid or the other concentrations of 
intraplate seismicity we observe now — these zones will die off 
and be replaced by others. Moreover, there are enough tectonic 
structures available that (typically small) earthquakes will 
occur almost randomly throughout continental interiors. 

A special case of this phenomenon occurs at passive con¬ 
tinental margins, where continental and oceanic lithospheres 
join. Although these areas are in general tectonically inactive,

Fig. 5.6-15 Locations of continuously recording GPS sites used to estimate a Euler vector for the presumably stable portion of North America. For each,
the misfit between the observed velocity and that predicted for a single plate is shown. The average misfit is less than 1 mm/yr, showing that eastern North
America is quite rigid. (Newman et al., 1999. Science, 284, 619-21, copyright 1999 American Association for the Advancement of Science.)

magnitude 7 earthquakes can occur, as on the eastern coast of
North America (Fig. 5.6-20). Such earthquakes may be associated
with stresses, including those due to the removal of glacial
loads, which reactivate faults remaining from the original
continental rifting (Fig. 5.6-1). Although such earthquakes
are observed primarily on previously glaciated margins, they
also occur on nonglaciated passive margins, perhaps due to
sediment loading. In some cases large sediment slides occur,
as was noted for the 1929 M_s 7.2 earthquake on the Grand
Banks of Newfoundland, because the slides broke trans-Atlantic
telephone cables and generated a tsunami that caused
27 fatalities. 5 An interesting unresolved question is whether
tectonic faulting is required for such earthquakes, or whether
the slump itself can account for what is seen on seismograms.
Some studies find that the seismograms are best fit by a double-couple
fault source, whereas others favor a single force consistent
with the slump (Fig. 4.4-3). The issue is important because
slumps occur in the sedimentary record along many passive

5 These deaths account for all but one of Canada's known earthquake fatalities to
date, although this situation could change after a large Cascadia subduction zone
earthquake.






Fig. 5.6-16 Top: Relation between interseismic motion and the expected
recurrence of large New Madrid earthquakes. The recurrence estimates
from paleoseismic studies and geodetic data are jointly consistent with
slip in the 1811-12 earthquakes of about 1 m, corresponding to a low
magnitude 7 earthquake. Bottom: Earthquake frequency-magnitude
data for the New Madrid zone. Both the instrumental and historic (1816-1984)
data predict a recurrence interval of about 1000 yr for magnitude
7 earthquakes. (Newman et al., 1999. Science, 284, 619-21, copyright
1999 American Association for the Advancement of Science.)

margins, even those that have not been recently deglaciated. 
Stresses associated with the removal of glacial loads may also 
play a role in causing earthquakes within continental interiors 
such as the northeastern USA and eastern Canada. It has also 
been suggested that the huge 1998 Balleny Island intraplate 


earthquake (Section 5.5.1) may have been triggered by stresses 
due to the shrinking Antarctic ice cap. 

As in the oceans, another interesting class of intraplate 
seismicity is associated with hot spots. The area near the 
Yellowstone hot spot in the western USA shows an intriguing 
pattern of seismicity along the margins of the Snake River plain 
(Fig. 5.6-21), which is the volcanic track the hot spot produced 
as the North American plate moved over it (Fig. 5.2-8). 
This seismicity, which includes the 1959 $M_s$ 7.5 Hebgen Lake,
Montana,$^6$ and 1983 $M_s$ 7.3 Borah Peak, Idaho, earthquakes,
forms a parabolic pattern extending southwestward from 
Yellowstone itself. It thus stands out from the regional seis¬ 
micity (Fig. 5.2-3) associated with the extensional tectonics of 
the eastern portion of the Basin and Range province, termed 
the Intermountain Seismic Belt. The absence of seismicity 
along the track itself seems likely to be a consequence of the 
thermal and magmatic perturbations produced by the hot spot, 
although the specific mechanism is still under discussion. 
Seismic tomography (Fig. 5.6-21) shows a low-velocity anomaly 
in the crust and upper mantle under Yellowstone itself, pre¬ 
sumably due to partial melting and hydrothermal fluids, and a 
deeper anomaly that persists along the track. 

In summary, although continental intraplate seismicity is a 
minor fraction of global seismic moment release, it has both 
scientific and societal interest precisely because it is rare. It 
provides one of our few ways of studying the limits of plate 
rigidity and intraplate stresses, and poses the challenge of 
deciding the appropriate level of earthquake preparedness 
for rare, but potentially destructive, earthquakes. 

5.7 Faulting and deformation in the earth 

Because earthquake faulting is a spectacular manifestation of 
the processes that deform the solid earth, we seek to under¬ 
stand how earthquakes result from and reflect this deforma¬ 
tion. Valuable insight comes from laboratory experiments 
and theoretical models for the behavior of solid materials. 
Although the experiments and models are much simpler than 
the complexities of the real earth, they allow us to think about 
key features. Seismology and geophysics thus exploit research 
devoted to material behavior by a range of disciplines, includ¬ 
ing engineering, materials science, and solid state physics. We 
touch only briefly on some basic ideas, and more information 
can be found in the references at the end of the chapter. 

5.7.1 Rheology

Materials can be characterized by their rheology, the way
they deform. In seismology we typically take a continuum
approach, considering the earth to be a continuous deformable
material. This means that we focus on its aggregate behavior
(Section 2.3) rather than on how its behavior is determined
by what happens at a microscopic scale.

$^6$ This earthquake triggered an enormous landslide that buried a
campground, causing 28 deaths, and dammed the Madison River, forming
Quake Lake. These dramatic effects are still visible today and make
the site well worth visiting. A visitor center and parking lot are
built on the slide.

Fig. 5.6-17 Schematic illustration of the relation between the recurrence times of seismicity and resulting seismic hazard for the intraplate New Madrid
seismic zone and the southern California plate boundary zone. Seismicity is assumed to be randomly distributed about an N-S line through 0, with
California 100 times more active, but New Madrid earthquakes causing potentially serious damage (circles show areas with acceleration 0.2 g or
greater, Table 1.2-4) over an area comparable to that for a California earthquake one magnitude unit larger.

Fig. 5.6-18 Schematic tectonic model for the New Madrid earthquakes.
(Braile et al., 1986. Tectonophysics, 131, 1-21, with permission from
Elsevier Science.)

To do this, consider the strain that results from compressing
a rock specimen. The simplest case is shown in Fig. 5.7-1a. For
small stresses, the resulting strain is proportional to the applied
stress, so the material is purely elastic. Elastic behavior happens
when seismic waves pass through rock, because the strains are
small (Section 2.3.8). However, once the applied stress reaches
a value $\sigma_f$, known as the rock’s fracture strength, the rock
suddenly breaks. Such brittle fracture is the simplest model
for what happens when an earthquake occurs on a fault. Thus
brittle fracture, a deviation from elasticity, generates elastic
seismic waves.

Other materials show a change in the stress-strain curve for
increasing stresses (Fig. 5.7-1b). For stresses less than the yield
stress, $\sigma_0$, the material acts elastically. Thus, if the stress
is released, the strain returns to zero. However, for stresses greater
than the yield stress, releasing the stress relieves the elastic
portion of the strain, but leaves a permanent deformation
(Fig. 5.7-1c). If the material is restressed, the stress-strain curve
now includes the point of the permanent strain. The material
behaves as though its elastic properties were unchanged, but
the yield strength has increased from $\sigma_0$ to $\sigma_0'$. The
portion of the stress-strain curve corresponding to stress above the
yield stress is called plastic deformation, in contrast to the elastic
region where no permanent deformation occurs. Materials
showing significant plasticity are called ductile. A common
approximation is to treat ductile materials as elastic-perfectly
plastic: stress is proportional to strain below the yield stress
and constant for all strains when stress exceeds the yield stress
(Fig. 5.7-2).

Fig. 5.6-19 Stress map for North America. (World Stress Map project, 2000. Courtesy of the US Geological Survey.)

An important result of laboratory experiments is that at low
pressures rocks are brittle, but at high pressures they behave
ductilely, or flow. Figure 5.7-3 shows experiments where a
rock is subjected to a compressive stress $\sigma_1$ that exceeds a
confining pressure $\sigma_3$. For confining pressures less than about
400 MPa the material behaves brittlely: it reaches the yield
strength, then fails. For higher confining pressures the material
flows ductilely. These pressures occur not far below the earth’s
surface: as discussed earlier, 3 km depth corresponds to
100 MPa pressure, so 800 MPa is reached at about 24 km.
This experimental result is consistent with the idea that the
strong lithosphere is underlain by the weaker asthenosphere.
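
The depth arithmetic in this paragraph can be written out explicitly. The short Python sketch below is only an illustration; it assumes the linear rule of thumb quoted in the text (100 MPa per 3 km), and the function names are hypothetical.

```python
# Rule of thumb quoted in the text: 3 km of depth corresponds to 100 MPa.
MPA_PER_KM = 100.0 / 3.0


def pressure_mpa(depth_km):
    """Approximate pressure (MPa) at a given depth (km)."""
    return MPA_PER_KM * depth_km


def depth_km(pressure_mpa_value):
    """Approximate depth (km) at which a given pressure (MPa) is reached."""
    return pressure_mpa_value / MPA_PER_KM


for p in (400.0, 800.0):
    print(f"{p:.0f} MPa is reached at about {depth_km(p):.0f} km")
```

Under this assumption the 400 MPa brittle limit of Fig. 5.7-3 corresponds to about 12 km, and 800 MPa to about 24 km.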

A related phenomenon is that materials behave differently 
at different time scales. A familiar example is that although an 
asphalt driveway is solid if one falls on it, a car parked on it 
during a hot day can sink a little ways into it. On short time 
scales the driveway acts rigidly, but on longer time scales it 
starts to flow as a viscous fluid. This effect is crucial in the 
earth, because the mantle is solid on the time scale needed for 
seismic waves to pass through it, but flows on geological time 
scales. 

5.7.2 Rock fracture and friction 

The first question we address is how and when rocks break.
In the brittle regime of behavior, the development of faults
and the initiation of sliding on preexisting faults depend on the
applied stresses.

Fig. 5.6-20 Earthquakes along the passive continental margin of eastern
Canada. These earthquakes may have occurred on faults remaining from
continental rifting. (Stein et al., 1979. Geophys. Res. Lett., 6, 537-40,
copyright by the American Geophysical Union.)


Given a stress field specified by a stress tensor, we use the
approach of Section 2.3.3 to find the variation in normal and
shear stress on faults of various orientations. For simplicity, we
consider the stress in two dimensions. If the coordinate axes
($\hat{e}_1$, $\hat{e}_2$) are oriented in the principal stress
directions, the stress tensor is diagonal,

$$\sigma_{ij} = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}. \tag{1}$$

To find the stress on a plane whose normal $\hat{e}_1'$ is at an angle
of $\theta$ from $\hat{e}_1$, the direction of $\sigma_1$ (Fig. 5.7-4),
we transform the stress tensor from the principal axis coordinate
system to a new coordinate system using the transformation matrix
(Section 2.3.3)

$$A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \tag{2}$$

so that the stress in the new (primed) system is

$$\sigma' = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \sigma_1\cos^2\theta + \sigma_2\sin^2\theta & (\sigma_2 - \sigma_1)\sin\theta\cos\theta \\ (\sigma_2 - \sigma_1)\sin\theta\cos\theta & \sigma_1\sin^2\theta + \sigma_2\cos^2\theta \end{pmatrix}. \tag{3}$$

Fig. 5.6-21 Top: Seismicity (1900-85) of the Intermountain area of the
western USA. Superimposed on the regional seismicity are earthquakes
forming a parabola along the margins of the Yellowstone-Snake River
plain (YRSP), the volcanic track of the Yellowstone hot spot. Bottom:
P-wave velocities across the hot spot track, shown by squares scaled in
size to the differences from a uniform-velocity model. The largest symbols
are +3%, with dark and open symbols showing low and high velocities.
(Smith and Braile, 1994. J. Volcan. Geotherm. Res., 61, 121-87, with
permission from Elsevier Science.)

Fig. 5.7-1 (a): A material is perfectly elastic
until it fractures when the applied stress
reaches $\sigma_f$. (b): A material undergoes plastic
deformation when the stress exceeds a yield
stress $\sigma_0$. (c): A permanent strain results
from plastic deformation when the stress is
raised to $\sigma_0'$ and released. (Axis labels in panel c:
permanent strain, recoverable elastic strain.)

Fig. 5.7-2 An elastic-perfectly plastic rheology, which is a commonly used
approximation for the behavior of ductile materials.


The normal and shear stresses on the plane vary, depending
on the plane’s orientation. The normal stress component,
denoted by $\sigma$, is

$$\sigma = \sigma'_{11} = \sigma_1\cos^2\theta + \sigma_2\sin^2\theta = \frac{(\sigma_1 + \sigma_2)}{2} + \frac{(\sigma_1 - \sigma_2)}{2}\cos 2\theta, \tag{4a}$$

and the shear component, denoted by $\tau$, is

$$\tau = \sigma'_{12} = (\sigma_2 - \sigma_1)\sin\theta\cos\theta = \frac{(\sigma_2 - \sigma_1)}{2}\sin 2\theta. \tag{4b}$$

Figure 5.7-4 shows $\sigma$ and $\tau$ as functions of $\theta$ for the
case of $\sigma_1$ and $\sigma_2$ negative ($|\sigma_1| > |\sigma_2|$), which
corresponds to compression at depth in the earth. A graphic way to
show these is with Mohr’s circle, a plot of $\sigma$ versus $\tau$
(Fig. 5.7-5). Values for all different planes lie on a circle centered
at $\sigma = (\sigma_1 + \sigma_2)/2$, $\tau = 0$, with radius
$(\sigma_2 - \sigma_1)/2$. The point on the circle with angle $2\theta$,
measured clockwise from the $-\sigma$ axis, gives the $\sigma$, $\tau$
values on the plane whose normal is at angle $\theta$ to $\sigma_1$.$^1$

$^1$ Following the seismological convention of compressive stresses being
negative, Mohr’s circle is shown for $\sigma < 0$. The opposite convention
is often used in rock mechanics, e.g. Figs. 5.7-3 and 5.7-10.

Fig. 5.7-3 Results of an experiment in which rocks are subjected
to a compressive stress $\sigma_1$ greater than the confining pressure
$\sigma_3$. Top: Differential stress ($\sigma_1 - \sigma_3$) versus strain
(compare to Figs 5.7-1 and 2) curves for various confining pressures.
Bottom: Ultimate strength ($\sigma_1 - \sigma_3$ at 10% strain rate, from
top) for various confining pressures. For low (< 400 MPa) confining
pressures, the material fractures, and its strength increases with
pressure. For higher pressures, the material is ductile, and its
strength increases only slowly with pressure. A semi-brittle
transition regime, in which both microfractures and crystal plasticity
occur, separates the brittle and ductile regimes. (Kirby, 1980.
J. Geophys. Res., 85, 6353-63, copyright by the American Geophysical
Union.)

Laboratory experiments on rocks under compression show
that fracture occurs when a critical combination of the absolute
value of shear stress and the normal stress is exceeded. This
relation, known as the Coulomb-Mohr failure criterion, can be
stated as

$$|\tau| = \tau_0 - n\sigma, \tag{5}$$

where $\tau_0$ and $n$ are properties of the material known as
the cohesive strength and coefficient of internal friction. (The
minus sign reflects the convention that compressional stresses
are negative.) The failure criterion plots as two lines in the
$\tau$-$\sigma$ plane, with $\tau$-axis intercepts $\pm\tau_0$ and slope
$\pm n$ (Fig. 5.7-6). If the principal stresses are $\sigma_1$, $\sigma_2$,
such that Mohr’s circle does not intersect the failure lines, the
material does not fracture. However, given the same $\sigma_2$ but a
higher $|\sigma_1|$, Mohr’s circle intersects the line, and the material
breaks.

The failure lines show how much shear stress, $\tau$, can be
applied to a surface subject to a normal stress $\sigma$ before failure
occurs. The cohesive strength is the minimum (absolute value)
shear stress for failure. The coefficient of internal friction
indicates the additional shear stress sustainable as the normal stress
increases. Thus, deeper in the crust, where the pressure and
hence normal stress are higher, rocks are stronger, and higher
shear stress is required to break them.

The failure lines and Mohr’s circle show on which plane failure
occurs for a given stress state. To find $\theta$, the angle between
the plane’s normal and the maximum compressive stress ($\sigma_1$)
direction, we write the failure lines as

$$|\tau| = \tau_0 - \sigma\tan\phi, \tag{6}$$

where $n = \tan\phi$, and $\phi$, the angle of internal friction, is formed
by extending the failure line to the $\sigma$ axis (Fig. 5.7-7). Fracture
occurs at point F, where the failure line is tangent to Mohr’s
circle. Considering the right triangle AFB, we see that


$$\phi = 2\theta - 90^\circ, \quad \text{so} \quad \theta = \phi/2 + 45^\circ. \tag{7}$$

For example, in introducing the relation between fault plane
solutions and crustal stresses in Section 2.3.5, we made the
simplest assumption that fracture occurs at 45° to the principal
stress axes, corresponding to the case $\phi = 0^\circ$, $n = 0$,
$\theta = 45^\circ$. Physically, this means that the normal stress has no
effect on the strength of the rock. However, rocks typically have $n$
about 1, so $\phi = 45^\circ$, $\theta = 67.5^\circ$, and the fault plane is
closer (22.5°) to the maximum compression ($\sigma_1$) direction
(Fig. 5.7-8). This idea is important when using P and T axes of focal
mechanisms to characterize stress directions.

Figure 5.7-7 also shows how to find the stresses when fracture
occurs. Consider the point T on the failure line such that
$T\sigma_2$ is perpendicular to the $\sigma$ axis. Because the angle
$AT\sigma_2$ is $\theta$ (triangles AFT and $A\sigma_2 T$ are congruent),

$$T\sigma_2 = A\sigma_2\cot\theta, \tag{8}$$

or, since $A\sigma_2 = (\sigma_2 - \sigma_1)/2$,

$$T\sigma_2 = \frac{(\sigma_2 - \sigma_1)}{2}\cot\theta. \tag{9}$$

Similarly,

$$T\sigma_2 = \tau_0 - \sigma_2\tan\phi \tag{10}$$

(the minus sign is because $\sigma_2$ is negative), so

$$\frac{(\sigma_2 - \sigma_1)}{2}\cot\theta = \tau_0 - \sigma_2\tan\phi. \tag{11}$$

This relation can be written in terms of the angle of the fracture
plane, using Eqn 7 and trigonometric identities,

$$\tan\phi = -\cot 2\theta = \frac{-1}{\tan 2\theta} = \frac{\tan^2\theta - 1}{2\tan\theta}, \tag{12}$$

yielding

$$\sigma_1 = -2\tau_0\tan\theta + \sigma_2\tan^2\theta. \tag{13}$$

We will use this relation between the stresses when fracture
occurs to estimate the maximum stresses in the crust.

Similar analyses show when the shear stress is high enough
to overcome friction and cause sliding on a previously existing
fault. The results are similar to those for a new fracture in
unbroken rock, except that at low stress levels the preexisting
fault has no cohesive strength. Thus slip on the fault occurs
when $|\tau| = -\mu\sigma$, where $\mu$ is the coefficient of sliding
friction, which can be expressed by an angle of sliding friction

$$\tan\alpha = \mu. \tag{14}$$

Fig. 5.7-4 Left: Geometry of a plane with normal
$\hat{e}_1'$, oriented at an angle $\theta$ from $\hat{e}_1$, the direction of
the maximum compressive stress $\sigma_1$. Right: Normal
stress, $\sigma$, and shear stress, $\tau$, as functions of the
angle $\theta$.

Fig. 5.7-5 Mohr’s circle: Given a state of stress described by principal
stresses $\sigma_1$ and $\sigma_2$, the normal stress, $\sigma$, and the shear
stress, $\tau$, for planes of all orientations lie on a circle with radius
$(\sigma_2 - \sigma_1)/2$. The point on the circle with angle $2\theta$,
measured clockwise from the $-\sigma$ axis, gives $\sigma$ and $\tau$
on a plane whose normal is at an angle $\theta$ from the direction of
$\sigma_1$.
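
The rotation in Eqns 1-4, the Mohr’s circle construction, and the fracture relations in Eqns 5-7 and 13 can be checked numerically. The Python sketch below is an illustration only: the function names and the example values ($n = 1$, $\tau_0 = 50$ MPa, $\sigma_2 = -100$ MPa) are assumptions chosen for the example, not values taken from the text.

```python
import math


def rotate_stress(sigma1, sigma2, theta_deg):
    """Stress tensor in the primed system, A sigma A^T (Eqn 3)."""
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    a = [[c, s], [-s, c]]                  # transformation matrix (Eqn 2)
    sig = [[sigma1, 0.0], [0.0, sigma2]]   # principal-axis tensor (Eqn 1)
    a_sig = [[sum(a[i][k] * sig[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    # Multiply by the transpose of A: (A^T)[k][j] = a[j][k].
    return [[sum(a_sig[i][k] * a[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mohr_point(sigma1, sigma2, theta_deg):
    """(sigma, tau) on Mohr's circle, from the double-angle forms of Eqn 4."""
    center = (sigma1 + sigma2) / 2.0
    radius = (sigma2 - sigma1) / 2.0
    two_theta = math.radians(2.0 * theta_deg)
    return center - radius * math.cos(two_theta), radius * math.sin(two_theta)


def fracture_angle_deg(n):
    """Angle between the fracture-plane normal and sigma_1 (Eqn 7)."""
    phi = math.degrees(math.atan(n))       # n = tan(phi)
    return phi / 2.0 + 45.0


def sigma1_at_fracture(sigma2, tau0, n):
    """Most compressive principal stress at fracture (Eqn 13)."""
    t = math.tan(math.radians(fracture_angle_deg(n)))
    return -2.0 * tau0 * t + sigma2 * t * t


# Hypothetical example values, in MPa, with compression negative.
n, tau0, sigma2 = 1.0, 50.0, -100.0
theta_f = fracture_angle_deg(n)
sigma1 = sigma1_at_fracture(sigma2, tau0, n)
sigma_f, tau_f = mohr_point(sigma1, sigma2, theta_f)
print(f"fracture angle = {theta_f:.1f} deg, sigma_1 = {sigma1:.1f} MPa")
print(f"on the fracture plane: sigma = {sigma_f:.1f} MPa, tau = {tau_f:.1f} MPa")
```

For $n = 1$ the sketch returns the 67.5° fracture angle quoted in the text, and the stresses on that plane satisfy Eqn 5 with equality, as expected where the failure line is tangent to Mohr’s circle.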


Fig. 5.7-6 The Coulomb-Mohr failure criterion assumes that a material
fractures when Mohr’s circle intersects the failure line.

Fig. 5.7-7 Fracture occurs at point F, where a material’s failure line,
characterized by its cohesive strength, $\tau_0$, and angle of internal
friction, $\phi$, is tangent to Mohr’s circle. Hence $\theta$ is the angle
of the plane on which fracture occurs, and F gives the stresses at
fracture. Point A is the center of Mohr’s circle, B is where the failure
line intersects the $\sigma$ axis, and $T\sigma_2$ is perpendicular to the
$\sigma$ axis. For simplicity, only the upper failure line for
$\tau > 0$ is shown in this and subsequent figures.


Figure 5.7-9 shows the Mohr’s circle representation of a rock
with preexisting faults. In addition to the failure line, there is a
frictional sliding line corresponding to

$$\tau = -\mu\sigma = -\sigma\tan\alpha. \tag{15}$$

Because the sliding line starts at the origin, it is initially below
the failure line. Assume that the stresses are large enough that
Mohr’s circle touches the failure line at the point yielding fracture
on a plane corresponding to an angle $\theta_f$. Similarly, the
frictional sliding line intercepts the circle at two points,
corresponding to angles $\theta_{s1}$ and $\theta_{s2}$. Thus the rock can
fail in several ways. If there are preexisting faults with angles
$\theta_{s1}$ or $\theta_{s2}$, slip on these faults may occur.
Alternatively, a new fracture may form on the plane given by
$\theta_f$. However, because this fracture occurs
at higher shear stress than is needed for frictional sliding on the
preexisting faults, sliding is favored over the formation of a
new fracture. Thus, if the stress has gradually risen to this level,
sliding on preexisting faults would probably have prevented a
new fracture from forming.

Fig. 5.7-8 With no internal friction, fracture occurs at an angle of 45°. For
$n = 1$, the fracture angle is 67.5°, and the fault plane is closer (22.5°)
to the maximum compression ($\sigma_1$) direction.

Fig. 5.7-9 Mohr’s circle for sliding on a rock’s preexisting faults. A new
fracture would form at an angle $\theta_f$, given by the failure line.
However, slip will occur on a preexisting fault if there are any with
angles between $\theta_{s1}$ and $\theta_{s2}$, given by the intersection of
the circle with the frictional sliding line.

This effect can have seismological consequences. The sim¬ 
plest way to use focal mechanisms to infer stress orientations is 
to assume that the earthquakes occurred on newly formed 
faults. However, if the rock had been initially faulted, the 
earthquakes may have occurred on preexisting faults. In the 
representation of Fig. 5.7-9, if faults exist with normals
oriented between $\theta_{s1}$ and $\theta_{s2}$ to the maximum compressive
stress, slip on these faults will occur rather than the formation of a
new fracture. Thus the inferred stress direction will be somewhat
inaccurate. For example, the thrust focal mechanisms
along the Himalayan front (Fig. 5.6-6) or eastern Andean foreland
thrust belt (Fig. 5.6-10) have fault planes that rotate as the
trend of the mountains changes, suggesting that the fault planes
are controlled by the existing structures, so the P axes only
partially reflect the stress field. A similar pattern appears for
T axes along the East African rift (Fig. 5.6-2). In general, stress
axes inferred from many fault plane solutions in an area seem
relatively coherent (Fig. 5.6-19). Thus we assume that the crust
contains preexisting faults of all orientations, so the average
stress orientation inferred from the focal mechanisms is not
seriously biased.

At this point, it is worth noting other complexities. Both the
failure and sliding curves may be more complicated than
straight lines. These curves, known as Mohr envelopes, can be
derived from experiments at various values of stress. Additional
complexity comes from the fact that water and other
fluids are often present in rocks, especially in the upper crust.
The fluid pressure, known as the pore pressure, reduces the
effect of the normal stress and allows sliding to take place at
lower shear stresses. This effect is modeled by replacing the
normal stress $\sigma$ with $\bar\sigma = \sigma - P_f$, known as the
effective normal stress, where $P_f$ is the pore fluid pressure.$^2$
Because the pore pressure is defined as negative, the effective normal
stress is reduced (less compressive). Similarly, effective principal
stresses taking into account pore pressure,

$$\bar\sigma_1 = \sigma_1 - P_f \quad \text{and} \quad \bar\sigma_2 = \sigma_2 - P_f, \tag{16}$$

are used in the fracture theory.

The relations we have discussed can be used to estimate the
maximum stresses that the crust can support. Laboratory
experiments (Fig. 5.7-10) for sliding on existing faults in a variety
of rock types find relations sometimes called Byerlee’s law:

$$\tau \approx \begin{cases} -0.85\,\sigma, & |\sigma| < 200\ \text{MPa} \\ 50 - 0.6\,\sigma, & |\sigma| > 200\ \text{MPa}. \end{cases} \tag{17}$$

These relations, written in terms of the normal and shear
stresses on a fault, can be used to infer the principal stress as
a function of depth. To do so, we write the minimum compressive
stress as $\sigma_3$, because we are in three dimensions. We
assume that the crust contains faults of all orientations, and
that the stresses cannot exceed the point where Mohr’s circle is
tangent to the frictional sliding line, or else sliding will occur
(Fig. 5.7-11). At shallow depths where $|\sigma| < 200$ MPa, Eqn 17
shows that $\tau_0 = 0$. Thus Eqn 13, the relation between the
stresses when fracture occurs, yields

$$\bar\sigma_1 = \bar\sigma_3\tan^2\theta_s. \tag{18}$$

Using Eqn 7 for the case of frictional sliding,

$$\theta_s = \alpha/2 + 45^\circ, \tag{19}$$

and the values in Eqn 17 give

$$\mu = \tan\alpha = 0.85, \quad \alpha \approx 41^\circ, \quad \theta_s \approx 66^\circ, \quad \tan^2 66^\circ \approx 5, \tag{20}$$

so the stresses are related by

$$\bar\sigma_1 \approx 5\bar\sigma_3. \tag{21}$$

At greater depths, where $|\sigma| > 200$ MPa, $\alpha \approx 31^\circ$ and
$\theta_s = 60.5^\circ$, so the stresses are related by

$$\bar\sigma_1 \approx -175 + 3.1\,\bar\sigma_3. \tag{22}$$


$^2$ The role of pore pressure in making sliding easier can be seen by
trying to slide an object across a dry table and then wetting the table.

Fig. 5.7-10 Shear stress versus normal
stress for frictional sliding, compiled for
various rock types. Compressive stress is
positive. (Byerlee, 1978. Pure Appl.
Geophys., 116, 615-26, reproduced with
the permission of Birkhäuser.)

Fig. 5.7-11 Mohr’s circle and sliding line for $|\sigma| < 200$ MPa. If the
lithosphere contains fractures in all directions, the stresses cannot
exceed those at the point where Mohr’s circle is tangent to the sliding
line, because sliding would occur.
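
The numbers that follow from Byerlee’s law can be reproduced with a few lines of Python. This is a sketch for illustration only: the function names are hypothetical, and the only inputs are the coefficients quoted in Eqn 17 together with Eqns 13, 14, and 19.

```python
import math


def byerlee_shear_stress(sigma_n):
    """Shear stress for frictional sliding (Eqn 17); MPa, compression negative."""
    if abs(sigma_n) < 200.0:
        return -0.85 * sigma_n
    return 50.0 - 0.6 * sigma_n


def sliding_stress_relation(mu, tau0):
    """Return (alpha, theta_s, slope, intercept), angles in degrees, such that
    sigma_1 = intercept + slope * sigma_3 when sliding occurs (Eqn 13)."""
    alpha = math.degrees(math.atan(mu))    # angle of sliding friction (Eqn 14)
    theta_s = alpha / 2.0 + 45.0           # Eqn 19
    t = math.tan(math.radians(theta_s))
    return alpha, theta_s, t * t, -2.0 * tau0 * t


shallow = sliding_stress_relation(0.85, 0.0)   # |sigma| < 200 MPa branch
deep = sliding_stress_relation(0.6, 50.0)      # |sigma| > 200 MPa branch
for name, (alpha, theta_s, slope, intercept) in (("shallow", shallow),
                                                 ("deep", deep)):
    print(f"{name}: alpha = {alpha:.1f} deg, theta_s = {theta_s:.1f} deg, "
          f"sigma_1 = {intercept:.0f} + {slope:.1f} sigma_3")
```

Carried through without rounding, the shallow branch gives $\alpha$ near 40.4° and $\tan^2\theta_s$ near 4.7, slightly below the rounded values of 41° and 5 used in the text; the deep branch reproduces the slope of about 3.1 and an intercept close to -175 MPa.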

We assume that one principal stress, $\sigma_1$ or $\sigma_3$, is the
vertical stress due to the lithostatic pressure as a function of depth
($z$),

$$\sigma_V = -\rho g z. \tag{23}$$

The other principal stress, which must be horizontal, is denoted
$\sigma_H$. The pore pressure $P_f(z)$ is unknown. One common assumption
is that the rock is dry, so $P_f(z) = 0$. Another is that the pore
pressure is hydrostatic, which is equivalent to assuming that
pores are connected up to the surface, so

$$P_f(z) = -\rho_f g z, \tag{24}$$

where $\rho_f$ is the density of the fluid, which is usually water, with
$\rho_f = 1$ g/cm$^3$. Alternatively, the pore pressure can be assumed to
be a fixed fraction of the lithostatic pressure (Section 2.3.6).

We now can find the strength of the crust, defined by the
maximum difference between the horizontal and vertical
stresses that the rock can support. At shallow depths where
$|\sigma| < 200$ MPa, Eqn 21 shows that $\bar\sigma_1 = 5\bar\sigma_3$. There
are two possibilities, depending on whether the vertical stress is the
most ($\sigma_1$) or least ($\sigma_3$) compressive. If the vertical stress
is the most compressive,

$$\begin{aligned} \sigma_V &= \sigma_1, & \bar\sigma_1 &= \sigma_V - P_f = -\rho g z - P_f(z) \\ \sigma_H &= \sigma_3, & \bar\sigma_3 &= \bar\sigma_1/5 = -(\rho g z + P_f(z))/5. \end{aligned} \tag{25}$$

Alternatively, if the vertical stress is the least compressive,

$$\begin{aligned} \sigma_V &= \sigma_3, & \bar\sigma_3 &= \sigma_V - P_f = -\rho g z - P_f(z) \\ \sigma_H &= \sigma_1, & \bar\sigma_1 &= 5\bar\sigma_3 = -5(\rho g z + P_f(z)). \end{aligned} \tag{26}$$

In the first case,

$$\sigma_H - \sigma_V = \bar\sigma_3 - \bar\sigma_1 = 0.8\,(\rho g z + P_f(z)) \tag{27}$$

corresponds to an extensional (positive) stress. In the second,

$$\sigma_H - \sigma_V = \bar\sigma_1 - \bar\sigma_3 = -4\,(\rho g z + P_f(z)) \tag{28}$$

corresponds to a compressive (negative) stress that is much
greater in absolute value. Thus, at any depth, the crust can
support greater compressive deviatoric stress than extensional
deviatoric stress (Fig. 5.7-12).

Fig. 5.7-12 Horizontal stresses measured in southern Africa. Dots are
for horizontal stresses being the least compressive ($\sigma_3$), and
triangles are for horizontal stresses being the most compressive
($\sigma_1$). The lithostatic stress gradient (26.5 MPa/km) is shown, along
with Byerlee’s law (BY) for zero pore pressure (DRY). The stronger line
is for compression, and the weaker one is for extension. The observed
stresses are within the maximum and minimum BY-DRY lines. (Brace and
Kohlstedt, 1980. J. Geophys. Res., 85, 6248-52, copyright by the American
Geophysical Union.)
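
Equations 23-28 translate directly into a depth-dependent strength estimate. The Python sketch below is illustrative and rests on stated assumptions: a crustal density of 2700 kg/m$^3$ (chosen because it reproduces the 26.5 MPa/km lithostatic gradient quoted in the caption of Fig. 5.7-12), $g = 9.8$ m/s$^2$, and hypothetical function names. It applies only in the shallow regime where $|\sigma| < 200$ MPa.

```python
G = 9.8  # gravitational acceleration, m/s^2 (assumed value)


def effective_overburden_mpa(z_km, rho=2700.0, rho_fluid=0.0):
    """rho*g*z + P_f(z) in MPa, with P_f = -rho_fluid*g*z (Eqns 23 and 24).

    rho_fluid = 0 corresponds to dry rock; rho_fluid = 1000 kg/m^3
    corresponds to hydrostatic pore pressure with water.
    """
    z_m = z_km * 1000.0
    return (rho - rho_fluid) * G * z_m / 1.0e6


def brittle_strength_mpa(z_km, rho=2700.0, rho_fluid=0.0):
    """Return (extensional, compressional) sigma_H - sigma_V from Eqns 27, 28."""
    x = effective_overburden_mpa(z_km, rho, rho_fluid)
    return 0.8 * x, -4.0 * x


for z in (1.0, 2.0):
    dry = brittle_strength_mpa(z)
    wet = brittle_strength_mpa(z, rho_fluid=1000.0)
    print(f"z = {z:.0f} km  dry: {dry[0]:.1f} / {dry[1]:.1f} MPa  "
          f"hydrostatic: {wet[0]:.1f} / {wet[1]:.1f} MPa")
```

At every depth the compressional strength is five times the extensional strength in absolute value, and hydrostatic pore pressure lowers both.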

5.7.3 Ductile flow

When rocks behave brittlely, their behavior is not time- 
dependent; they either strain elastically or fail. By contrast, the 
deformation of ductile rock depends on time. A common model 
for the time-dependent behavior is a Maxwell viscoelastic 
material , which behaves like an elastic solid on short time 
scales and like a viscous fluid on long time scales. This model 
can describe the mantle because seismic waves propagate as 
though the mantle were solid, whereas postglacial rebound and 
mantle convection occur as though the mantle were fluid. 

To see this difference, consider two types of deformation
in one dimension. For an elastic solid subjected to elastic strain
$e_E = e_{11}$,

$$\sigma = E e_E, \tag{29}$$

where $E$ is Young’s modulus, and $\sigma$ is $\sigma_{11}$. The simplest
viscous fluid obeys

$$\sigma = 2\eta\,\frac{de_F}{dt}, \tag{30}$$

where $\eta$ is the viscosity, and $e_F$ is the fluid portion of the strain.
This equation defines the viscosity, the property that measures
a fluid’s resistance to shear.$^3$

We often think of an elastic material as a spring, which
exerts a force proportional to distance. Thus stress and
strain are proportional at any instant, and there are no
time-dependent effects. By contrast, the viscous material is thought
of as a dashpot, a fluid damper that exerts a force proportional to
velocity. Hence the stress and strain rate are proportional, and
the material’s response varies with time. These effects are combined
in a viscoelastic material, which can be thought of as a
spring and dashpot in series (Fig. 5.7-13). The combined elastic
and viscous response comes from the combined strain rate

$$\frac{de}{dt} = \frac{de_E}{dt} + \frac{de_F}{dt} = \frac{1}{E}\frac{d\sigma}{dt} + \frac{\sigma}{2\eta}. \tag{31}$$

This differential equation, the rheological law for a Maxwell
substance, shows how the stress in the material evolves after
a strain $e_0$ is applied at time $t = 0$ and then remains constant. At
$t = 0$ the derivative terms dominate, so the material behaves
elastically, and has an initial stress

$$\sigma_0 = E e_0. \tag{32}$$

For $t > 0$, $de/dt = 0$, so

$$\frac{d\sigma}{dt} = -\frac{E}{2\eta}\,\sigma, \tag{33}$$

whose integral is

$$\sigma(t) = \sigma_0\exp[-(Et/2\eta)]. \tag{34}$$

Thus stress relaxes from its initial value as a function of time
(Fig. 5.7-13). A useful parameter is the Maxwell relaxation time,$^4$

$$\tau_M = \frac{2\eta}{E} \approx \frac{\eta}{\mu}, \tag{35}$$

required for the stress to decay to $e^{-1}$ of its initial value. For
times less than $\tau_M$ the material can be considered an elastic
solid, whereas for longer times it can be considered a viscous
fluid.

$^3$ In familiar terms, viscosity measures how “gooey” a fluid is. Maple
syrup is somewhat more viscous than water, and the earth’s mantle is
about $10^{24}$ times more viscous.

$^4$ Definitions of the Maxwell time vary, but always involve the ratio of
the viscosity to an elastic constant.

For example, if the mantle is approximately a Poisson solid
with $\mu \sim 10^{12}$ dyn/cm$^2$ and $\eta \sim 10^{22}$ poise, its Maxwell
time is about $10^{10}$ s or 300 years. Although the viscosity is not that
well known, so estimates of the Maxwell time vary, it is clear
that we can treat the mantle as a solid for seismological
purposes and as a fluid in tectonic modeling. If we model the
mantle as viscoelastic, then a load applied on the surface has an
effect that varies with time. Figure 5.7-13c shows the effect of a
150 km-wide sediment load, as might be expected on a passive
continental margin. Initially, the earth responds elastically,
causing large flexural bending stresses. With time, the mantle
flows, so the deflection beneath the load deepens and the
stresses relax. In the long-time limit, the stress goes to zero, and the
deflection approaches the isostatic solution, because isostasy
amounts to assuming that the lithosphere has no strength.
Stress relaxation may explain why large earthquakes are rare
at continental margins, except where glacial loads have been
recently removed (Fig. 5.6-20). Although the large sediment
loads should produce stresses much greater than other sources
of intraplate stress, including the smaller and less dense ice
loads, the stresses produced by sediment loading early in the
margin’s history may have relaxed.
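
The Maxwell time quoted above, and the exponential relaxation of Eqn 34, can be evaluated directly. The Python sketch below is a minimal illustration: the function names are hypothetical, the mantle values are the order-of-magnitude figures given in the text, and the relation $E = 2.5\mu$ is the standard result for a Poisson solid (Poisson’s ratio of 0.25).

```python
import math

SECONDS_PER_YEAR = 3.156e7


def maxwell_time_shear(eta, mu):
    """Maxwell time defined as viscosity over rigidity, eta / mu."""
    return eta / mu


def maxwell_time_young(eta, young):
    """Maxwell time defined as 2 * eta / E, the e-folding time of Eqn 34."""
    return 2.0 * eta / young


def relaxed_stress(sigma0, t, eta, young):
    """Stress remaining at time t after a step in strain (Eqn 34)."""
    return sigma0 * math.exp(-young * t / (2.0 * eta))


eta = 1.0e22          # poise, order of magnitude quoted in the text
mu = 1.0e12           # dyn/cm^2, order of magnitude quoted in the text
young = 2.5 * mu      # Poisson solid: E = 2 mu (1 + 0.25)

tau_shear = maxwell_time_shear(eta, mu)
tau_young = maxwell_time_young(eta, young)
print(f"eta/mu = {tau_shear:.1e} s = {tau_shear / SECONDS_PER_YEAR:.0f} yr")
print(f"2 eta/E = {tau_young:.1e} s = {tau_young / SECONDS_PER_YEAR:.0f} yr")
print(f"stress fraction left after 1000 yr: "
      f"{relaxed_stress(1.0, 1000.0 * SECONDS_PER_YEAR, eta, young):.3f}")
```

Either definition gives a few hundred years, far longer than the periods of seismic waves and far shorter than geological time scales, which is the point of the argument above.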

Fig. 5.7-13 (a) Model of a viscoelastic material as an elastic spring and
viscous dashpot in series. (b) Stress response of a viscoelastic material
to an applied strain. The Maxwell relaxation time, $\tau_M$, is the time the
stress takes to decay to $e^{-1}$ of its initial value. (c) Evolution of the
deflection and bending stress produced by a sediment load on a
viscoelastic earth. At first the earth responds elastically, as shown by
the long-dashed line, but with time it flows, so the deflection beneath
the load deepens and the stresses relax. (Stein et al., 1989, with kind
permission from Kluwer Academic Publishers.)

Laboratory experiments indicate that the rheology of
minerals in ductile flow can be described by

$$\frac{de}{dt} = \dot{e} = f(\sigma)\,A\exp[-(E^* + PV^*)/RT], \tag{36}$$

where $T$ is temperature, $R$ is the gas constant, and $P$ is pressure.
$f(\sigma)$ is a function of the stress difference $|\sigma_1 - \sigma_3|$, and
$A$ is a constant. The effects of pressure and temperature are
described by the activation energy $E^*$ and the activation volume
$V^*$. Observed values of $f(\sigma)$ are often fit well by assuming
$f(\sigma) = |\sigma_1 - \sigma_3|^n$, so that

$$\dot{e} = |\sigma_1 - \sigma_3|^n A\exp[-(E^* + PV^*)/RT]. \tag{37}$$

The rheology of such fluids is characterized by a power
law. If $n = 1$, the material is called Newtonian, whereas a
non-Newtonian fluid with $n = 3$ is often used to represent the
mantle. From Eqn 30 we see that for a Newtonian fluid the
viscosity depends on both temperature and pressure:

$$\eta = (1/2A)\exp[(E^* + PV^*)/RT]. \tag{38}$$

Thus the viscosity decreases exponentially with temperature.
This decrease is assumed to give rise to a strong lithosphere
overlying a weaker asthenosphere, and the restriction of
earthquakes to shallow depths.$^5$ For a non-Newtonian fluid, Eqn 30
gives the effective viscosity, the equivalent viscosity if the fluid
were Newtonian.
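
The strength of the temperature dependence in Eqn 38 is easy to underestimate, so a small numerical illustration may help. The Python sketch below is hedged accordingly: the function names are hypothetical, the activation energy of 520 kJ/mol and the two temperatures are assumed example values, and the pressure term is neglected ($V^* = 0$).

```python
import math

R = 8.314  # gas constant, J/(mol K)


def newtonian_viscosity(temp_k, a_const, e_act, pressure=0.0, v_act=0.0):
    """Eqn 38: eta = (1 / 2A) exp[(E* + P V*) / (R T)]."""
    return math.exp((e_act + pressure * v_act) / (R * temp_k)) / (2.0 * a_const)


def viscosity_ratio(t_cold, t_hot, e_act):
    """eta(t_cold) / eta(t_hot) for V* = 0; the constant A cancels."""
    return math.exp((e_act / R) * (1.0 / t_cold - 1.0 / t_hot))


# Assumed example: cooling from 1600 K to 1300 K with E* = 520 kJ/mol.
ratio = viscosity_ratio(1300.0, 1600.0, 520.0e3)
print(f"viscosity increases by a factor of about {ratio:.0f}")
```

With these assumed values, a 300 K drop in temperature raises the viscosity by more than three orders of magnitude.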

We think of equations like Eqn 37 as showing the strength, 
or maximum stress difference |<7 1 - a 3 |, that the viscous 
material can support. This stress difference depends on 
temperature, pressure, strain rate, and rock type. The material 


5 Temperature-dependent viscosity is an effect familiar to automobile drivers in cold 
temperatures, when the engine and the transmission become noticeably sluggish. 














is stronger at higher strain rates, and weakens exponentially 
with high temperatures. At shallow depths, the small pressure 
effect is often neglected, so the activation volume V* is treated 
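as zero. For example, a commonly used flow law for dry olivine⁶ is 

$$\dot{\varepsilon} = 7 \times 10^{4}\, |\sigma_1 - \sigma_3|^{3} \exp\left(\frac{-0.52\ \text{MJ/mol}}{RT}\right) \quad \text{for } |\sigma_1 - \sigma_3| < 200\ \text{MPa},$$

$$\dot{\varepsilon} = 5.7 \times 10^{11} \exp\left[\frac{-0.54\ \text{MJ/mol}}{RT} \left(1 - \frac{|\sigma_1 - \sigma_3|}{8500}\right)^{2}\right] \quad \text{for } |\sigma_1 - \sigma_3| > 200\ \text{MPa},$$

where $\dot{\varepsilon}$ is in s$^{-1}$. Similarly, for quartz, 

$$\dot{\varepsilon} = 5 \times 10^{6}\, |\sigma_1 - \sigma_3|^{3} \exp\left(\frac{-0.19\ \text{MJ/mol}}{RT}\right) \quad \text{for } |\sigma_1 - \sigma_3| < 1000\ \text{MPa}.$$
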
Fig. 5.7-14 Strength envelopes as a function of depth for various values of 
$\lambda$, the ratio of pore pressure to lithostatic pressure. BY-HYD lines are for 
Byerlee’s law with hydrostatic pore pressure. At shallow depths, strength 
is controlled by brittle fracture; at greater depths ductile flow laws predict 
rapid weakening. In the ductile flow regime, quartz is weaker than olivine. 
In the brittle regime, the lithosphere is stronger in compression (right side) 
than in extension (left side). (Brace and Kohlstedt, 1980. J. Geophys. Res., 
85, 6248-52, copyright by the American Geophysical Union.) 


At a given strain rate, quartz is much weaker (can sustain a 
smaller stress difference) than olivine. Thus the quartz-rich 
continental crust should be weaker than the olivine-rich oceanic 
crust, an effect whose tectonic consequences are discussed 
next. 
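
The comparison between the two minerals can be checked numerically by inverting the low-stress power laws for the stress difference. The following is a minimal sketch, not the authors' calculation: it takes the printed coefficients at face value, and it assumes stress in MPa, temperature in kelvin, R = 8.314 J/(mol K), and an arbitrary illustrative temperature of 1100 K. The function name `power_law_stress` is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K) (assumed units)

def power_law_stress(strain_rate, temp_k, coeff, energy, n=3):
    """Stress difference (MPa, assumed) that sustains strain_rate at temp_k.

    Inverts strain_rate = coeff * stress**n * exp(-energy / (R * T)).
    """
    return (strain_rate / coeff * math.exp(energy / (R * temp_k))) ** (1.0 / n)

# Coefficients as printed in the flow laws above (V* taken as zero).
OLIVINE = dict(coeff=7e4, energy=0.52e6)  # low-stress law, below 200 MPa
QUARTZ = dict(coeff=5e6, energy=0.19e6)   # below 1000 MPa

strain_rate = 1e-15  # 1/s
temp_k = 1100.0      # hypothetical temperature, K

olivine = power_law_stress(strain_rate, temp_k, **OLIVINE)
quartz = power_law_stress(strain_rate, temp_k, **QUARTZ)
print(f"olivine: {olivine:.3g} MPa, quartz: {quartz:.3g} MPa")
```

With these inputs the olivine stress falls inside the range where the low-stress law applies, and the quartz stress is smaller by several orders of magnitude, in line with the statement that quartz is much weaker at a given strain rate.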

5.7.4 Strength of the lithosphere 

The strength of the lithosphere as a function of depth depends 
upon the deformation mechanism. At shallow depths, rocks 
fail by either brittle fracture or frictional sliding on preexisting 
faults. Both processes depend in a similar way on the normal 
stress, with rock strength increasing with depth. However, at 
greater depths the ductile flow strength of rocks is less than the 
brittle or frictional strength, so the strength is given by the flow 
laws and decreases with depth as the temperatures increase. 
This temperature-dependent strength is the reason why the 
cold lithosphere forms the planet’s strong outer layer. 

To calculate the strength, a strain rate and a geotherm giving 
temperature as a function of depth are assumed. At shallow 
depths the strength, the maximum stress difference before 
frictional sliding occurs, is computed using Eqns 27 and 28. At 
some depth, the frictional strength exceeds the ductile strength 
allowed by the flow law, so at greater depths the maximum 
strength is given by the flow law. Figure 5.7-14 shows a 
strength plot, known as a strength envelope, for a strain rate 
of $10^{-15}$ s$^{-1}$ and a temperature gradient appropriate for old 
oceanic lithosphere or stable continental interior. In the 
frictional region, curves are shown for various values of $\lambda$, 
the ratio of pore pressure to lithostatic pressure. The higher 

6 Brace and Kohlstedt (1980). 


pore pressures result in lower strengths. Ductile flow laws are 
shown for quartz and olivine, minerals often used as models for 
continental and oceanic rheologies. Strength increases with 
depth in the brittle region, due to the increasing normal stress, 
and then decreases with depth in the ductile region, due to 
increasing temperature. Hence strength is highest at the brittle- 
ductile transition. Strength decreases rapidly below this transition, 
so the lithosphere should have little strength at depths 
greater than about 25 km in the continents and 50 km in the 
oceans. The strength envelopes show that the lithosphere is 
stronger for compression than for tension in the brittle regime, 
but the two are symmetric in the ductile regime. Strength 
envelopes are often plotted using the rock mechanics convention 
of compression positive. 

The actual distribution of strength with depth is probably 
more complicated, because the brittle-ductile transition occurs 
over a region of semi-brittle behavior that includes both brittle 
and plastic processes (Fig. 5.7-3). However, this simple model 
gives insight into various observations. In particular, we have 
seen that the depths of earthquakes in several tectonic environments 
seem to be limited by temperature. This makes sense, 
because for a given strain rate and rheology the exponential 
dependence on temperature would make a limiting strength for 
seismicity approximate a limiting temperature. 
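
The link between a limiting strength and a limiting temperature can be illustrated by solving a power-law flow law for temperature. This is a sketch under stated assumptions: it uses the low-stress olivine law quoted earlier, takes 200 MPa (the upper limit of that law) as a hypothetical strength threshold, and assumes stress in MPa and R = 8.314 J/(mol K). The function name `limiting_temperature` is illustrative only.

```python
import math

R = 8.314  # gas constant, J/(mol K) (assumed units)

def limiting_temperature(stress_mpa, strain_rate, coeff=7e4, energy=0.52e6, n=3):
    """Temperature (K) at which the power-law strength equals stress_mpa.

    Solves strain_rate = coeff * stress**n * exp(-energy / (R * T)) for T.
    """
    return energy / (R * math.log(coeff * stress_mpa**n / strain_rate))

for rate in (1e-15, 1e-18):
    temp_c = limiting_temperature(200.0, rate) - 273.15
    print(f"strain rate {rate:g} 1/s -> limiting temperature {temp_c:.0f} deg C")
```

Because temperature enters through an exponential, a thousandfold change in strain rate moves the limiting temperature by only about 100 K in this sketch, which is why a strength limit behaves approximately like a temperature limit.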

To see this, consider Fig. 5.7-15, which shows that as 
oceanic lithosphere ages and cools, the predicted strong region 
deepens. This result seems plausible because earthquake depths, 
seismic velocities, and effective elastic thicknesses imply that 
the strong upper part of the lithosphere thickens with age 
(Fig. 5.3-9). The strength envelopes are thus consistent with the 
observation that the maximum depth of earthquakes within 

Fig. 5.7-15 Strength envelopes showing maximum stress difference 
(strength) as a function of depth for an olivine rheology, for geotherms 
(right) corresponding to cooling oceanic lithosphere of different ages. 
Strength in the brittle regime is reduced by higher pore pressure; strength 
in the ductile regime is reduced by lower strain rate. The depth range in 
which the material is strong enough for faulting increases with age. 
(Wiens and Stein, 1983. J. Geophys. Res., 88, 6455-68, copyright by 
the American Geophysical Union.) 

the oceanic lithosphere is approximately bounded by the 750°C 
isotherm (Fig. 5.7-16). These envelopes are drawn for strain 
rates of $10^{-15}$ and $10^{-18}$ s$^{-1}$, appropriate for slow deformation 
within plates. By contrast, a seismic wave with a period 
of 1 s, a wavelength of 10 km, and a displacement of $10^{-6}$ m 
corresponds to a strain rate of $10^{-10}$ s$^{-1}$. The successively 
greater effective elastic thicknesses, depth of the deepest earthquakes, 
and depth of the low-velocity zone are thus consistent 
with strength increasing with strain rate. 
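
The seismic strain rate quoted above can be checked with an order-of-magnitude estimate. One simple scaling, assumed here rather than taken from the text, divides the displacement $u$ by the wavelength $\Lambda$ to obtain a strain, and then by the period $T$ to obtain a rate:

$$\dot{\varepsilon} \sim \frac{u}{\Lambda T} = \frac{10^{-6}\ \text{m}}{(10^{4}\ \text{m})(1\ \text{s})} = 10^{-10}\ \text{s}^{-1}.$$

Factors of $2\pi$ are ignored, so only the order of magnitude is meaningful. The result is five to eight orders of magnitude faster than the tectonic strain rates used for the envelopes.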

The strength envelopes give insight into differences between 
continental and oceanic lithospheres (Fig. 5.7-17). First, quartz 
is weaker than olivine at a given temperature (Fig. 5.7-14), 

Fig. 5.7-16 Plots of strength and seismicity versus temperature. The 
strength envelopes explain the observation that intraplate oceanic 
seismicity occurs only above the 750°C isotherm. (Wiens and Stein, 1985. 
Tectonophysics, 116, 143-62, with permission from Elsevier Science.) 



Fig. 5.7-17 Schematic strength envelope for continents. Below the ductile 
lower crust may be a stronger zone in the olivine-rich mantle. (Chen and 
Molnar, 1983. J. Geophys. Res., 88, 4183-4214, copyright by the 
American Geophysical Union.) 


consistent with the fact that the limiting temperature for 
continental seismicity is lower than for oceanic earthquakes 
(Fig. 5.7-18). Second, the strength profiles differ. The strength 
of oceanic lithosphere increases with depth and then decreases. 
However, in continental lithosphere we expect such a profile in 

















Fig. 5.7-18 Limiting temperatures for continental seismicity. These 
temperatures are much lower than those for oceanic lithosphere, since the 
quartz rheology in continents is much weaker than olivine. (Courtesy of 
J. Strehlau and R. Meissner.) 

the quartz-rich crust, but also a second, deeper zone of strength 
below the Moho, due to the olivine rheology. This “jelly sandwich” 
profile including a weak zone may be part of the reason 
why continents deform differently than oceanic lithosphere. 
For example, some continental mountain building (Fig. 5.6-6) 
may involve crustal thickening in which slices of upper crust, 
which are too buoyant to subduct, are instead thrust atop one 
another. The weaker lower crust may also contribute in other 
ways to the general phenomenon that continental plate 
boundaries are broader and more complex than their oceanic 
counterparts (Fig. 5.2-4). 

5.7.5 Earthquakes and rock friction 

It is natural to assume that earthquakes occur when tectonic 
stress exceeds the rock strength, so a new fault forms or an 
existing one slips. Thus steady motion across a plate boundary 
seems likely to give rise to a cycle of successive earthquakes 
at regular intervals, with the same slip and stress drop (Fig. 5.7-19). 
However, we have seen that the earthquake process 
is more complicated. The time between earthquakes on plate 
boundaries varies (Fig. 1.2-15), although the plate motion 
causing the earthquakes is steady. Earthquakes sometimes 
rupture along the same segments of a boundary as in earlier 
earthquakes, and other times along a different set (Fig. 5.4-27). 
Moreover, many large earthquakes show a complicated rupture 
pattern, with some parts of the fault releasing more seismic 
energy than others (Fig. 4.5-10). Attempts to understand these 

Fig. 5.7-19 Stress and slip history for an idealized earthquake cycle on a 
plate boundary, in which all earthquakes have the same stress drop and 
coseismic slip. (Shimazaki and Nakata, 1980. Geophys. Res. Lett., 7, 
279-82, copyright by the American Geophysical Union.) 

complexities often combine two basic themes. Some of the 
complexity may be due to intrinsic randomness of the failure 
process, such that some small ruptures cascade into large earthquakes, 
whereas others do not (Section 1.2.6). Other aspects of 
the complexity may be due to features of rock friction. 

Interesting insight emerges from considering an experiment 
in which stress is applied until a rock breaks. When the fault 
forms, some of the stress is released, and then motion stops. If 
stress is reapplied, another stress drop and motion occur once 
the stress reaches a certain level. So long as stress is reapplied, 
this pattern of jerky sliding and stress release continues 
(Fig. 5.7-20). 

This pattern, called stick-slip, looks like a laboratory version 
of what happens in a sequence of earthquakes on a fault. By 
this analogy, the stress drop in an earthquake relieves only part 
of the total tectonic stress, and as the fault continues to be 
loaded by tectonic stress, occasional earthquakes occur. The 
analogy is strengthened by the fact that at higher temperatures 
(about 300°C for granite), stick-slip does not occur (Fig. 5.7-20). 
Instead, stable sliding occurs on the fault, much as earthquakes 
do not occur at depths where the temperature exceeds a certain 
value. Thus, understanding stick-slip in the laboratory seems 
likely to give insight into the earthquake process. 

Stick-slip results from a familiar phenomenon: it is harder to 
start an object sliding against friction than to keep it going 
once it is sliding. This is because the static friction stopping the 
object from sliding exceeds the dynamic friction that opposes 
motion once sliding starts.⁷ To understand how this difference 

7 This effect is the basis of cross-country skiing, where loading one ski makes it grip 
the snow, while unloading the other lets it glide. 

Fig. 5.7-20 Force versus slip history for a rock sample. At low 
temperature, so long as stress is reapplied, a stick-slip pattern of jerky 
sliding and stress release continues. By contrast, stable sliding occurs 
at high temperature. (Brace and Byerlee, 1970. Science, 168, 1573-5, 
copyright 1970 American Association for the Advancement of Science.) 


causes stick-slip, and get insight into stick-slip as a model for 
earthquakes, consider the experiment in Fig. 5.7-21. It turns 
out that if an object is pulled across a table with a rubber band, 
jerky stick-slip motion occurs.⁸ Thus a steady load, combined 
with the difference in static and dynamic friction, causes an 
instability and a sequence of discrete slip events. 

Fig. 5.7-21 A simple spring and slider block analog for stick-slip as a 
model for earthquakes. The slider is loaded by force f due to the spring 
end moving at velocity v. Before sliding, the block is retarded by a static 
friction force $\tau = -\mu_s\sigma$, but once sliding starts, the friction force decreases 
to $-\mu_d\sigma$. A series of slip events occur, each with slip $\Delta u$ and force change 
(stress drop) $\Delta f$. 

8 We suggest trying this experiment. 

We analyze this situation assuming that a block (sometimes 
called a slider) is loaded by a spring that applies a force f proportional 
to the spring constant (stiffness) k and the spring extension. 
If the loading results from the spring’s far end moving at a 
velocity v, the spring force is 

$$f = k(\zeta + vt - u), \tag{41}$$

where u is the distance the block slipped, and $\zeta$ is the spring 
extension when sliding starts at t = 0. This motion is opposed 
by a frictional force $|\tau| = -\mu\sigma$ equal to the product of $\sigma$, the 
compressive (negative) normal stress due to the block’s weight, 
and the friction coefficient, $\mu$. By Newton’s second law that 
force equals mass times acceleration, 

$$m \frac{d^2 u}{dt^2} = k(\zeta + vt - u) + \mu\sigma. \tag{42}$$

However, the block starts sliding only once the spring force 
exceeds the frictional force, so just before sliding starts at t = 0, 

$$0 = k\zeta + \mu_s\sigma, \tag{43}$$

where $\mu_s$ is the static friction coefficient. For simplicity, assume 
that at the instant sliding starts, the friction drops to its 
dynamic value $\mu_d$, and 

$$m \frac{d^2 u}{dt^2} = k(\zeta - u) + \mu_d\sigma. \tag{44}$$

Subtracting Eqn 43 from Eqn 44 gives 

$$m \frac{d^2 u}{dt^2} = -ku + (\mu_d - \mu_s)\sigma = -ku + \Delta\mu\,\sigma, \tag{45}$$

which we can use as the equation of motion for the block’s slip 
history u(t) if the loading rate v is slow enough to ignore during 
the slip event. 

A solution to Eqn 45, with initial conditions $u(0) = 0$ and 
$du(0)/dt = 0$, is 

$$u(t) = \frac{\Delta\mu\,\sigma}{k}(1 - \cos\omega t) \quad \text{(slip)},$$

$$\frac{du(t)}{dt} = \frac{\Delta\mu\,\sigma}{\sqrt{km}} \sin\omega t \quad \text{(velocity)},$$

$$\frac{d^2 u(t)}{dt^2} = \frac{\Delta\mu\,\sigma}{m} \cos\omega t \quad \text{(acceleration)}, \tag{46}$$



where $\omega = \sqrt{k/m}$. As shown, the block starts slipping because 
the spring force exceeds the friction force. During the slip 
event, the spring force decreases as the spring shortens, until it 
becomes less than the friction force and the block slows and 
eventually stops. The block stops once the shaded area above 
the spring force line equals that below the line, or when the 
work done accelerating the block equals that which decelerated 
it. If the spring end continues to move, loading continues until 
the spring force again equals the static friction force and 
another slip event occurs. 
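
The behavior described above can be checked by integrating Eqn 45 numerically for a single slip event. The sketch below is illustrative only: the function name `simulate_slip_event`, the time step, and the parameter values are all assumptions in arbitrary units. It compares the numerical result with what the solution in Eqn 46 implies, namely that the velocity returns to zero when $\omega t = \pi$, at which point the slip is $2\Delta\mu\,\sigma/k$.

```python
import math

def simulate_slip_event(k, m, dmu_sigma, dt=1e-4):
    """Integrate m*u'' = -k*u + dmu_sigma (Eqn 45) until the block stops.

    dmu_sigma is the product (mu_d - mu_s) * sigma, which is positive
    because both factors are negative. Returns (duration, total_slip).
    """
    u, vel, t = 0.0, 0.0, 0.0
    while True:
        acc = (-k * u + dmu_sigma) / m
        vel_new = vel + acc * dt       # semi-implicit Euler step
        if vel_new <= 0.0 and t > 0.0:
            return t, u                # block has come to rest
        vel = vel_new
        u += vel * dt
        t += dt

k, m, dmu_sigma = 4.0, 1.0, 2.0        # arbitrary illustrative values
duration, slip = simulate_slip_event(k, m, dmu_sigma)
print(duration, math.pi * math.sqrt(m / k))   # numerical vs. pi / omega
print(slip, 2.0 * dmu_sigma / k)              # numerical vs. u at omega*t = pi
print(k * slip, 2.0 * dmu_sigma)              # drop in spring force
```

The duration depends only on k and m, whereas the slip and the drop in spring force scale with the friction difference, and the loading rate does not enter at all because it was neglected during the event.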

It is interesting to think of analogies between this model of 
slip events and earthquakes. The slip event’s duration $t_D$, 
analogous to an earthquake rise time (Section 4.3.2), satisfies 

$$\frac{du(t_D)}{dt} = 0, \qquad t_D = \pi\sqrt{m/k}. \tag{47}$$

The total slip during the event is 

$$\Delta u = u(t_D) = 2\Delta\mu\,\sigma/k, \tag{48}$$

and the drop in the spring force, which is analogous to an 
earthquake stress drop (Section 4.6.3), is 

$$\Delta f = 2\Delta\mu\,\sigma. \tag{49}$$

Thus the rise time depends on the spring constant, but not on 
the difference between static and dynamic friction. However, 
the total slip and stress drop depend upon the friction difference. 
None of these depend upon the loading rate, which is 
analogous to the rate of plate motion causing earthquakes 
on a plate boundary. But the loading rate determines the time 
between successive slip events. Thus, in the plate boundary 
analogy, the time between large earthquakes depends on the 
plate motion rate, but their slip and stress drop depend on the 
frictional properties of the fault and the normal stress. Hence 
faster-slipping boundaries would have more frequent large 
earthquakes, but the slip and stress drop in them would not 
be greater than on a slower boundary with similar frictional 
properties and normal stress. 

Laboratory experiments show that the difference between 
static and dynamic friction is more complicated than the constant 
values assumed in this simple model. We can think of the 
lower dynamic friction as showing either velocity weakening, 
decreasing as the object moves faster, or slip weakening, 
decreasing as the object moves further. Frictional models called 
rate- and state-dependent friction with a variable coefficient of 
sliding friction, $\mu$, are used to describe these effects. In a simple 
model of this sort, 

$$\mu = [\mu_0 + b\psi + a \ln(v/v^*)], \tag{50}$$

where $\mu_0$ is the coefficient of static friction. The friction depends 
on the slip rate v, normalized by a rate $v^*$, and a state 
variable $\psi$ that represents the slip history 

$$\frac{d\psi}{dt} = -(v/L)[\psi + \ln(v/v^*)], \tag{51}$$

where L is an experimentally determined characteristic distance. 
The friction also depends on a and b, which characterize 
the material. 

Fig. 5.7-22 Evolution of friction in a simple rate- and state-dependent 
model. If the slip rate increases by a factor of e, friction increases by a, 
and then decreases as slip progresses to a steady-state value a - b. (After 
Scholz, 1990. Reprinted with the permission of Cambridge University 
Press.) 

Figure 5.7-22 illustrates how friction evolves. If the slip rate 
increases by a factor of e, the friction increases by a, and then 
decreases as slip progresses, reaching a new steady-state value. 
With time, $\psi$ reaches a steady-state value given by Eqn 51, 

$$0 = -(v/L)[\psi_{ss} + \ln(v/v^*)], \qquad \psi_{ss} = -\ln(v/v^*). \tag{52}$$

The steady-state friction (Eqn 50) is 

$$\mu_{ss} = [\mu_0 + b\psi_{ss} + a \ln(v/v^*)] = [\mu_0 + (a - b) \ln(v/v^*)], \tag{53}$$

and varies with slip rate as 

$$\frac{d\mu_{ss}}{d \ln v} = (a - b), \tag{54}$$

so after the slip velocity change, the net friction change is (a - b). 
If (a - b) is negative, the material shows velocity weakening, 
which permits earthquakes to occur by stick-slip. However, for 
(a - b) positive, the material shows velocity strengthening, and 
stable sliding is expected. Laboratory results (Fig. 5.7-20) show 
that a - b for granite changes sign at about 300°C, which should 
be the limiting temperature for earthquakes. Thus the frictional 

Time interval 
A to B     300 days 
B to B'    instantaneous 
B' to C    2.5 min 
C to D     4.2 days 
D to E     1.9 yr 
E to F     9.1 yr 
F to G     82.5 yr 
$T_{cy}$ = 92.9 yr 


Fig. 5.7-23 Earthquake cycle for a model in which a strike-slip fault with rate- and state-dependent frictional properties is loaded by plate motion. The slip 
history for three cycles as a function of depth and time is shown by the lines, each of which represents a specific time. Steady motion occurs at depth, and 
stick-slip occurs above 11 km. (After Tse and Rice, 1986. J. Geophys. Res., 91, 9452-72, copyright by the American Geophysical Union.) 


model predicts a maximum depth for continental earthquakes 
similar to that predicted by the rock strength arguments. 
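
The friction response to a step in slip rate (Eqns 50 to 53) can be reproduced with a few lines of code. The following is a minimal sketch rather than a model of any real fault: the values of a, b, L, the reference rate, and the base friction are hypothetical, and the state equation (Eqn 51) is integrated with a simple forward Euler scheme over a slip distance of 10 L.

```python
import math

def friction(mu0, a, b, psi, v, v_ref):
    """Eqn 50: mu = mu0 + b*psi + a*ln(v / v_ref)."""
    return mu0 + b * psi + a * math.log(v / v_ref)

def velocity_step(mu0, a, b, L, v_ref, v_old, v_new, n_steps=20000):
    """Friction history after the slip rate jumps from v_old to v_new.

    The state variable starts at its steady-state value for v_old (Eqn 52)
    and then evolves by Eqn 51 while the block slips a distance 10*L.
    """
    psi = -math.log(v_old / v_ref)
    mu_before = friction(mu0, a, b, psi, v_old, v_ref)
    dt = 10.0 * L / v_new / n_steps
    history = []
    for _ in range(n_steps):
        history.append(friction(mu0, a, b, psi, v_new, v_ref))
        psi += dt * (-(v_new / L) * (psi + math.log(v_new / v_ref)))
    return mu_before, history

a, b = 0.010, 0.015   # hypothetical values; a - b < 0 is velocity weakening
mu_before, history = velocity_step(
    mu0=0.6, a=a, b=b, L=1e-5, v_ref=1e-6, v_old=1e-6, v_new=math.e * 1e-6
)
print("immediate change:", history[0] - mu_before)   # compare with a
print("net change:", history[-1] - mu_before)        # compare with a - b
```

For an e-fold increase in slip rate the friction first jumps by a and then relaxes to a net change of a - b, so with these assumed values the fault ends up weaker at the higher rate.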

These results can be used to simulate the earthquake 
cycle, using fault models analogous to the simple slider model 
(Fig. 5.7-21). Figure 5.7-23 shows the slip history as a function 
of depth and time for a model in which a strike-slip fault is 
loaded by plate motion. The fault is described by rate- and 
state-dependent frictional properties as a function of depth, 
such that stick-slip occurs above 11 km. Initially from time A 
to B, stable sliding occurs at depth, and a little precursory 
slip occurs near the surface. The earthquake causes 2.5 m of 
sudden slip at shallow depths, as shown by the curves for times 
B and B'. As a result, the faulted shallow depths “get ahead” 
of the material below, loading that material and causing 
postseismic slip from times B' to F. Once this is finished, the 
93-year cycle starts again with steady stable sliding at depth. 

Such models replicate many aspects of the earthquake cycle. 
An interesting difference, however, is that the models predict 
earthquakes at regular intervals, whereas earthquake histories 
are quite variable. Some of the variability may be due to the 
effects of earthquakes on other faults, or other segments of 
the same fault. Figure 5.7-24 shows this idea schematically 
for the slider model in Fig. 5.7-21. Assume that after an earthquake 
cycle, the compressive normal stress $\sigma$ on the slider is reduced. 
This “unclamping” reduces the frictional force resisting 
sliding, so it takes less time for the spring force to rise again to 
the level needed for the next slip event. Conversely, increased 
compression “clamps” the slider more, and so increases the 
time until the next slip event. In addition, by Eqn 49, the stress 
drop in the slip event changes when $\sigma$ changes. 


Fig. 5.7-24 Modification of a slider block model (Fig. 5.7-21) to include 
the effects of changes in normal stress. Reduced normal stress ($|\sigma| < |\sigma'|$) 
reduces the frictional force, and so “unclamps” the fault and decreases the 
time until the next slip event. 
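
The effect of clamping and unclamping on the time to the next slip event can be sketched with the slider relations. This is an illustrative calculation, not the authors': it assumes that the normal stress changes only after a slip event has finished, that the spring reloads at rate kv while the block is stuck (from Eqn 41), and that the event dropped the spring force by the amount in Eqn 49. The function name and all parameter values are hypothetical.

```python
def time_to_next_event(k, v, mu_s, mu_d, sigma_old, sigma_new):
    """Reload time after a slip event if normal stress then changes.

    Stresses are compressive and therefore negative, as in the text.
    """
    threshold_old = -mu_s * sigma_old              # spring force at failure (Eqn 43)
    force_drop = 2.0 * (mu_d - mu_s) * sigma_old   # Eqn 49
    force_after = threshold_old - force_drop       # spring force once the block stops
    threshold_new = -mu_s * sigma_new              # force needed for the next event
    return max(threshold_new - force_after, 0.0) / (k * v)

common = dict(k=1.0, v=1.0, mu_s=0.6, mu_d=0.5, sigma_old=-10.0)
unchanged = time_to_next_event(sigma_new=-10.0, **common)
unclamped = time_to_next_event(sigma_new=-9.0, **common)
clamped = time_to_next_event(sigma_new=-11.0, **common)
print(unclamped, unchanged, clamped)
```

Reducing the compression shortens the wait and increasing it lengthens the wait, which is the behavior sketched in Fig. 5.7-24.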


For earthquakes, the analogy implies that earthquake occurrence 
on a segment of a fault may reflect changes in the stress on 
the fault resulting from earthquakes elsewhere. This concept is 
quantified using the Coulomb-Mohr criterion (Eqn 5) that 
sliding can occur when the shear stress exceeds that on the sliding 
line (Fig. 5.7-9), or $\tau > -\mu\sigma$. We can thus define the Coulomb 
failure stress 

$$\sigma_f = \tau + \mu\sigma \tag{55}$$

such that failure occurs when $\sigma_f$ is greater than zero. Whether a 
nearby earthquake brings a fault closer to or further from failure 
is shown by the change in Coulomb failure stress due to the 
earthquake, 








$$\Delta\sigma_f = \Delta\tau + \mu\Delta\sigma. \tag{56}$$

Failure is favored by positive $\Delta\sigma_f$, which can occur either from 
increased shear stress $\tau$ or a reduced normal stress (compression 
is negative, so $\Delta\sigma > 0$ favors sliding). 

Fig. 5.7-25 Predicted changes in Coulomb failure stress due to the 
1971 San Fernando earthquake. The Whittier Narrows and Northridge 
earthquakes subsequently occurred in regions where the 1971 earthquake 
increased the failure stress. (Stein et al., 1994. Science, 265, 1432-5, 
copyright 1994 American Association for the Advancement of Science.) 

Some earthquake observations provide support for this 
idea. Figure 5.7-25 shows the predicted Coulomb failure stress 
changes in the Los Angeles region due to the 1971 ($M_s$ 6.6) 
San Fernando earthquake. The stress change pattern reflects 
the earthquake’s focal mechanism, thrust faulting on a NW-SE-striking 
fault (Fig. 5.2-3). Two moderate earthquakes, the 
1987 Whittier Narrows ($M_L$ 5.9) and 1994 Northridge ($M_w$ 
6.7) earthquakes, subsequently occurred in regions where the 
1971 earthquake increased the failure stress, suggesting that 
the stress change may have had a role in triggering the earthquakes. 
A similar pattern has been found after other earthquakes, 
and some studies have found that aftershocks are 
concentrated in regions where the mainshock increased the 
failure stress. Stress triggering may explain why successive earthquakes 
on a fault sometimes seem to have a coherent pattern. 
For example, the 1999 $M_s$ 7.4 Izmit earthquake on the North 
Anatolian fault (Fig. 5.6-8) appears to be part of a sequence 
of major ($M_s$ 7) earthquakes over the past 60 years, which 
occurred successively further to the west, and hence closer to 
the metropolis of Istanbul. 
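
The change in Coulomb failure stress used in such studies is simple to evaluate once the shear and normal stress changes resolved on a fault are known. The sketch below only illustrates the sign convention (compression negative); the friction coefficient and the stress changes, given in bars, are hypothetical values, and the function name is illustrative.

```python
def coulomb_stress_change(d_tau, d_sigma, mu):
    """Change in Coulomb failure stress: d_tau + mu * d_sigma.

    d_sigma > 0 means reduced compression (unclamping).
    """
    return d_tau + mu * d_sigma

mu = 0.4  # hypothetical friction coefficient
promoted = coulomb_stress_change(d_tau=0.5, d_sigma=0.5, mu=mu)    # more shear, unclamped
inhibited = coulomb_stress_change(d_tau=0.2, d_sigma=-1.0, mu=mu)  # more shear, but clamped
print(promoted, inhibited)
```

In the first case both terms move the fault toward failure. In the second the added compression outweighs the shear stress increase, so the fault is moved away from failure even though the shear stress rose.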

An intriguing feature of such models is that the predicted 
stress changes are of the order of 1 bar, or only 1-10% of the 
typical stress drops in earthquakes (Section 4.6.3). Such small 


stress changes should only trigger an earthquake if the tectonic 
stress is already close to failure. However, as in the slider model 
(Fig. 5.7-24), stress changes can affect the time until the tectonic 
stress is large enough to produce earthquakes. It has been 
argued that the 1906 San Francisco earthquake reduced the 
failure stress on other faults in the area, causing a “stress 
shadow” and increasing the expected time until the next earthquake 
on these faults. This is consistent with the observation 
that during the 75 years before the 1906 earthquake, the 
area had 14 earthquakes with M w above 6, whereas only one 
occurred in the subsequent 75 years. Such analyses may help 
improve estimates of the probability that an earthquake of a 
certain size will occur on a given fault during some time period. 
To date, such estimates have large uncertainties (Section 4.7.3), 
in part because of the large variation in the time intervals 
between earthquakes. Stress loading models, some of which 
incorporate rate- and state-dependent friction because simple 
Coulomb friction does not predict large enough changes in 
recurrence time, may explain some of the variations and thus 
reduce these uncertainties. 

This discussion brings out the importance of understanding 
the state of stress on faults. On this issue, the friction models 
give some insight, but major questions remain. Earthquake 
stress drops estimated from seismological observations are 
typically less than a few hundred bars (tens of MPa). Yet, the 
expected strength of the lithosphere (e.g., Figs 5.7-14 to 5.7-16) 
is much higher, in the kilobar (hundreds of MPa) range. The 
laboratory results (Fig. 5.7-20) and frictional models (Fig. 
5.7-21) suggest an explanation for this difference, because in 
both the stress drop during a slip event is only a fraction of the 
total stress. 

However, the frictional models do not explain an intriguing 
problem called the “San Andreas” or “fault strength” paradox. 
As noted in Section 5.4.1, a fault under shear stress $\tau$ slipping at 
rate v should generate frictional heat at a rate equal to $\tau v$. 
Thus, if the shear stresses on faults are as high (kbar or hundreds 
of MPa) as expected from the strength envelopes, significant 
heat should be produced. But little if any heat flow 
anomaly is found across the San Andreas fault (Fig. 5.7-26), 
suggesting that the fault is much weaker than expected. A similar 
conclusion emerges from consideration of stress orientation 
data. Although the Coulomb-Mohr model predicts that the 
maximum principal stress directions inferred from focal mechanisms, 
geological data, and boreholes should be about 23° 
from the San Andreas fault (Fig. 5.7-8), the observed directions 
are essentially perpendicular to the fault (Fig. 5.6-19), implying 
that the fault acts almost like a free surface. To date, there is no 
generally accepted explanation for these observations. The 
most obvious one is that the effective stress on the fault is reduced 
by high pore pressure, but there is discussion about 
whether pressures much higher than hydrostatic pressure could 
be maintained in the fault zone. An alternative explanation, 
that the fault zone is filled by low-strength clay-rich fault 
gouge, faces the difficulty that experiments on such material 
find that it has normal strength unless pore pressures are high. 
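
The size of the expected frictional heating can be gauged from the relation that the heat generated per unit fault area is the product of shear stress and slip rate. Taking, purely for illustration, a shear stress of 100 MPa and a slip rate of 35 mm/yr (both assumed values, not taken from the text),

$$q = \tau v \approx (10^{8}\ \text{Pa}) \times \frac{0.035\ \text{m}}{3.16 \times 10^{7}\ \text{s}} \approx 0.1\ \text{W m}^{-2}.$$

A shear stress of 10 MPa would give a tenth of this value, so the heat flow expected near the fault is a direct measure of the stress at which it slips.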

Fig. 5.7-26 Observed (squares) heat flow across the San Andreas fault. 
The elevated heat flow predicted by shear heating (solid line) is not 
observed, except for one point (CJON, Cajon pass), where alternative 
interpretations are possible, implying that the fault is weak. (Lachenbruch 
and Sass, 1988. Geophys. Res. Lett., 15, 981-4, copyright by the 
American Geophysical Union.) 

In summary, ideas based on rock friction are providing 
important insights into earthquake mechanics. Although many 
issues remain unresolved, and some attractive notions remain 
to be fully demonstrated, rock friction seems likely to play a 
growing role in addressing earthquake issues. 

5.7.6 Earthquakes and regional deformation 

The large, rapid deformation in earthquakes is often part of a 
slow deformation process occurring over a broader region. As 
discussed in Section 5.6.2, there often appear to be differences 
between the seismic, aseismic, transient, and permanent 
deformations sampled by different techniques on different time 
scales. Experimental and theoretical ideas about rheology and 
lithospheric dynamics are being used to investigate the relation 
between earthquakes and the regional deformations that produce 
them. 

We have seen that earthquakes often reflect deformation distributed 
over a broad plate boundary zone. In this case, we can 
think of the lithosphere as a viscous fluid and use earthquakes 
as indicators of its deformation. This idea is like the physical 
model (Fig. 5.6-7) that used deformable plasticine as an analogy 
for the deformation of Asia resulting from the Himalayan 
collision. Figure 5.7-27 shows such an analysis for part of the 
Pacific-North America plate boundary zone in the western 
United States. The deformation is assumed to result from a 
combination of forces due to the transform plate boundary and 
forces due to the potential energy of elevated topography, 
which tends to spread under its own weight. To test this idea, a 
continuous velocity field has been interpolated from space-geodetic, 
fault slip, and plate motion data (Figs 5.2-3 and 5.6-3). 
The velocity field is treated as being due to the motion of a 
viscous fluid, and is converted to a strain rate tensor field. This 
is then compared to the magnitude of the stress tensor inferred 

Fig. 5.7-27 Left: Estimated velocity field for part of the Pacific-North America plate boundary zone in the western USA. Right: Effective viscosity 
determined by dividing the magnitude of the deviatoric stress tensor by the magnitude of the strain rate tensor. (Flesch et al., 2000. Science, 287, 834-6, 
copyright 2000 American Association for the Advancement of Science.) 
















from topography and plate boundary forces. The ratio of stress 
to strain rate at any point, which is the vertically averaged 
effective viscosity, varies significantly. Low values along the 
San Andreas fault and western Great Basin show that the strain 
rates are relatively high for the predicted stress, consistent with 
a weak lower crust. The Great Valley-Sierra Nevada block has 
little internal deformation, and thus acts relatively rigidly and 
appears as a high-viscosity region. Summing seismic moment 
tensors (Section 5.6.2) yields a seismic strain rate averaging 
about 60% of the inferred total strain. As discussed earlier, this 
discrepancy may indicate some aseismic deformation or that 
the 150 years of historical seismicity is too short for a reliable 
estimate. 

Viscous fluid models can be used to study how the lithosphere 
deforms on different time scales. For example, as noted 
in Section 5.6.2, GPS data across the entire Nazca-South 
America plate boundary zone show faster motion than is 
inferred from structural geology or topographic modeling. 
The difference probably occurs because the GPS data record 
instantaneous velocities that include both permanent deformation 
and elastic deformation that will be recovered during 
future earthquakes, whereas the lower geological rates reflect 
only the permanent deformation. This can be modeled by representing 
the overriding South American plate using a simple 
one-dimensional system of a spring, a dashpot, and a pair of 
frictional plates (Fig. 5.7-28). This system approximates the 
behavior of the crust: the spring gives the elastic response over 
short periods, the dashpot gives the viscous response over geological 
time scales, and the frictional plates simulate the thrust 
faulting earthquake cycle at the trench. As plate convergence 
compresses the system, the stress $\sigma(t)$ increases with time 
until it reaches a yield strength $\sigma_y$, when an earthquake occurs, 
the stress drops to a lower level, and the process repeats. Displacement 
accumulates at a rate $v_0$ except during earthquakes, when the 
displacement drops by an amount $\Delta u$. The topography and 
geologic data record the averaged long-term shortening rate $v_c$ 
shown by the envelope of the sawtooth curve, whereas GPS 
data record the higher instantaneous velocity $v_0$. The instantaneous 
velocity thus results from the portion of the plate motion 
locked at the trench that deforms the overriding plate elastically 
(Fig. 4.5-14) and is released as seismic slip in interplate 
earthquakes. By contrast, the aseismic slip component at the 
trench has no effect because it does not contribute to locking on 
the interface and deformation of the overriding plate. Similar 
models are being explored for other regions where deformation 
appears to vary on different time scales. 
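
The sawtooth displacement history described above can be sketched numerically. The Python example below is a minimal illustration rather than the published model: it assumes strictly periodic earthquakes with a recurrence interval `recurrence`, each recovering the same displacement `slip`, so that the envelope rate is $v_c = v_0 - \Delta u/\text{recurrence}$. The function names and all numerical values are hypothetical.

```python
# Sketch of a sawtooth displacement history: steady loading at v0,
# interrupted by earthquakes that recover a displacement `slip`.
# All numbers are hypothetical and chosen only for illustration.

def long_term_rate(v0, slip, recurrence):
    """Envelope rate: net displacement per cycle divided by the cycle length."""
    return v0 - slip / recurrence


def displacement_history(v0, slip, recurrence, n_cycles, steps_per_cycle=100):
    """Return (times, displacements) sampled through n_cycles earthquake cycles."""
    times, disp = [], []
    dt = recurrence / steps_per_cycle
    u = 0.0
    for cycle in range(n_cycles):
        for step in range(steps_per_cycle):
            times.append(cycle * recurrence + step * dt)
            disp.append(u)
            u += v0 * dt          # interseismic loading at the instantaneous rate
        u -= slip                 # earthquake: displacement drops by `slip`
    return times, disp


v0 = 30.0           # mm/yr, instantaneous (GPS-like) rate; assumed value
slip = 2000.0       # mm recovered in each earthquake; assumed value
recurrence = 100.0  # yr between earthquakes; assumed value

vc = long_term_rate(v0, slip, recurrence)
times, disp = displacement_history(v0, slip, recurrence, n_cycles=5)
```

With these assumed values, the displacement at the start of each cycle grows at the rate `vc`, which is lower than `v0`, mirroring the difference between geological and GPS rates discussed in the text.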

Viscous fluid models are also used to analyze other aspects 
of the earthquake cycle. For example, Fig. 5.7-29 shows the 
strain rate near portions of the San Andreas fault compared to 
the time since the last great earthquake on that portion of the 
fault. Postseismic motion seems to continue for a period of 
years after an earthquake and then slowly decays, presum¬ 
ably due to the steady interseismic motion. A similar picture 
emerges from GPS and other geodetic results following large 
trench thrust faulting earthquakes. For a number of years, sites 
near the trench on the overriding plate move seaward, showing 





Fig. 5.7-28 a: Model for a viscoelastic-plastic crust to describe the
response of the overriding South American plate to the subduction of
the Nazca plate. The dashpot represents a viscous body modeling the
permanent deformation, the spring represents an elastic body modeling
the transient deformation, and the frictional plates represent the
earthquake cycle at the trench. b: Stress evolution for the model.
c: Displacement history for the model. Displacement accumulates at the
instantaneous rate $v_0$ except during earthquakes, when slip $\Delta u$ occurs.
GPS data record a gradient starting at $v_0$ from the trench, whereas the
envelope of the displacement curve, $v_c$, is the long-term shortening rate
reflected in geological records and topography. (Liu et al., 2000. Geophys.
Res. Lett., 18, 3005-8, copyright by the American Geophysical Union.)


postseismic motion consistent with the earthquake focal 
mechanism (Fig. 4.5-15). Eventually, however, the sites resume 
the landward interseismic motion usually seen at trenches 
(Fig. 5.6-10). Such observations are challenging to interpret, 
because postseismic afterslip on or near a fault can have effects 
at the surface similar to viscoelastic flow of the asthenosphere 
(Fig. 5.7-29), but offer the prospect of improving our under¬ 
standing of both earthquake processes and the rheology of the 
lithosphere and the asthenosphere. A tantalizing possibility is 
that the viscous asthenosphere permits stress waves generated 
by large earthquakes to travel slowly for large distances and 
contribute to earthquake triggering. 











Fig. 5.7-29 Shear strain rate near portions of the San Andreas fault 
compared to the time since the last great earthquake. The data are similar 
to the predictions of two alternative models: viscoelastic stress relaxation 
(solid curve) and aseismic postseismic slip beneath the earthquake fault 
plane (dashed line). (Thatcher, 1983. J. Geophys. Res., 88, 5893-902,
copyright by the American Geophysical Union.) 


Further reading 

Given the comparatively recent discovery of plate tectonics, its importance 
for most aspects of geology, and its crucial role in the earthquake process, 
many excellent sources, a few of which are listed here, offer more informa¬ 
tion about this chapter’s topics. 

The dramatic development of plate tectonics is discussed from the view 
of participants by Menard (1986) and in Cox’s (1973) collection of classic 
papers. Basic ideas in plate tectonics are treated in most introductory and 
structural geology texts. More detailed treatments include Uyeda (1978), 
Fowler (1990), Kearey and Vine (1990), and Moores and Twiss (1995). 
Cox and Hart (1986) present the basic kinematic concepts, and global 


plate motion models are discussed by Chase (1978), Minster and Jordan 
(1978), and DeMets et al. (1990).

Thermal and mechanical aspects of plate tectonics are discussed by 
Turcotte and Schubert (1982) and Sleep and Fujita (1997). Mid-ocean 
ridge tectonics and structure are discussed by Solomon and Toomey (1992) 
and Nicolas (1995). The thermal evolution of oceanic lithosphere is 
discussed by Parsons and Sclater (1977) and Stein and Stein (1992); 
McKenzie (1969) presents the subduction zone thermal model we follow. 
Papers in Bebout et al. (1996) cover many aspects of subduction, and 
Kanamori (1986) reviews subduction zone thrust earthquakes. Lay (1994) 
treats the nature and fate of subducting slabs, and deep earthquakes are 
reviewed by Frohlich (1989), Green and Houston (1995), and Kirby et al.
(1996b). For a derivation of the ridge push force see Parsons and Richter
(1980); Wiens and Stein (1985) discuss its application to oceanic intraplate
stresses. Yeats et al. (1997) cover a wide variety of topics about the relation
of earthquakes to regional geology. Rosendahl (1987) reviews continental 
rifting. Papers in Gregersen and Basham (1989) treat aspects of passive 
margin and continental interior earthquakes with emphasis on postglacial 
effects. 

Concepts in continental deformation are treated by Molnar (1988) and 
England and Jackson (1989); Gordon (1998) gives an overview of plate 
rigidity and diffuse plate boundaries. Applications of space geodesy to 
tectonics are reviewed by papers in Smith and Turcotte (1993) and 
by Dixon (1991), Gordon and Stein (1992), and Segall and Davis 
(1997). Many GPS data and results, including an overview brochure, can 
be found on the University NAVSTAR Consortium WWW site
http://www.unavco.org. Stress maps and their interpretations are discussed by
Zoback (1992) and other papers in the same journal issue; stress maps
are available at the World Stress Map project WWW site
http://www-wsm.physik.uni-karlsruhe.de.

Mantle plumes in general are reviewed by Sleep (1992); Nataf (2000) 
and Foulger et al. (2001) discuss seismic imaging of plumes; Smith and 
Braile (1994) discuss the Yellowstone hot spot; and Stein and Stein (1993) 
discuss oceanic hot spot swells. Papers in Peltier (1989) treat many aspects 
of mantle convection; Silver et al. (1988) explore the relationship between 
subduction, convection, and mantle structure; and Christensen (1995) 
reviews the effects of phase transitions on mantle convection. The heat 
engine perspective on global tectonics is discussed by Stacey (1992), and 
Ward and Brownlee (2000) summarize the arguments advocating a crucial 
role for plate tectonics in the origin and survival of life on Earth. 

Topics involving rock mechanics, flow, and their tectonic applications 
are discussed by Jaeger (1970), Weertman and Weertman (1975), Jaeger 
and Cook (1976), Turcotte and Schubert (1982), Kirby (1983), Kirby and 
Kronenberg (1987), and Ranalli (1987). Scholz (1990) and Marone (1998) 
cover topics dealing with the relation of rock mechanics to earthquakes, 
with special emphasis on rock friction. Our treatment of the slider 
model for faulting follows Scholz (1990). Related topics, including issues 
of continental deformation and fault strength, are also treated by papers 
in Evans and Wong (1992). Stein (1999) summarizes the concept of stress 
triggering of earthquakes. 









Problems

1. Assume that Pacific-North America plate motion along the San 
Andreas fault occurs at 35 mm/yr. 

(a) If all this motion occurs seismically in earthquakes about 
22 years apart, which is a typical recurrence interval for the 
Parkfield fault segment, how much slip would you expect in 
the earthquakes? From Fig. 4.6-7, estimate likely fault 
lengths and magnitudes for such earthquakes. 

(b) Give similar estimates if the earthquakes occur about 
132 years apart, as at Pallett Creek. 

2. Assume that all the earthquakes in the Pallett Creek sequence (Fig. 
1.2-15) involved 4 m of seismic slip. Using the time interval from 
the present to the 1857 earthquake, calculate the seismic slip rate 
on this portion of the San Andreas fault. Next, do so by averag¬ 
ing the recurrence intervals for the past two earthquakes (1857 
and 1812), the past three, and so on for the entire earthquake 
history. What are the implications of this simple experiment for 
seismic slip estimates? What other sources of uncertainty should 
also be considered, and how might they affect this estimate? 

3. (a) Use Table 5.2-1 to find the rate that the Juan de Fuca plate
subducts beneath North America at 46°N, 125°W.

(b) If all this motion occurs in large earthquakes, how often would 
you expect an earthquake if the slip in each were 5 m? How 
would this estimate change if the slip were 10 or 20 m? 

(c) How would the answers to (b) change if only 25% or 50% of 
the plate motion occurred by seismic slip? 

(d) Paleoseismic observations and historic records of a tsunami 
imply that this subduction zone has had very large earthquakes 
approximately 500 years apart. Suggest some possibilities in 
view of parts (a)-(c). How might you attempt to distinguish 
between them? 

(e) The crust subducting at this trench is about 10 million years 
old. Given the convergence rate and the observations from 
other trenches in Fig. 5.4-30, what might you infer about the 
moment magnitude of the largest earthquake expected here? 
Find the corresponding seismic moment and suggest a plaus¬ 
ible fault geometry and amount of slip that would also be con¬ 
sistent with the paleoseismic and plate motion observations. 

4. For rigid plates, Eqn 5.2.10 shows that we can find the angular 
velocity vector of one plate from the sum of two others. Show that 
at a point we can also do this for the linear velocity vectors. 

5. The news media sometimes ask “How large would the largest 
possible earthquake be?” Estimate the seismic moment and 
moment magnitude by assuming that all the trenches in the world 
(48,000 km) slip at the same time, that 10 m of slip occurs, and the 
fault width is 250 km. 

6. Estimate the thermal Reynolds number R defined in Eqns 5.3.19 
and 5.4.3, assuming that $\kappa = 10^{-6}\ \mathrm{m^2\,s^{-1}}$. What does this estimate
imply about the processes of plate cooling and subduction? 

7. Assume that oceanic lithosphere has a thermal conductivity of
$3.1\ \mathrm{W\,m^{-1}\,{}^{\circ}C^{-1}}$.

(a) Find the heat flow for old oceanic lithosphere, assuming a 
linear temperature gradient (Fig. 5.3-8), a basal temperature 
of 1450°C, and a plate thickness of 95 km. 

(b) How would this value change for a basal temperature of 
1350°C and plate thickness 125 km? 

(c) If the lithosphere under a midplate region were thinned to 
50 km while the basal temperature remained 1350°, what 
would the heat flow be, assuming a linear temperature 
gradient? 
8. A way to get insight into the physics of subduction is to use a
classic result from fluid mechanics, called Stokes' problem, which
describes the terminal velocity $v$ at which a sphere of radius $a$ and
density $\rho$ sinks due to gravity in a fluid with viscosity $\eta$ and lower
density $\rho'$. The result is $v = 2ga^2(\rho - \rho')/(9\eta)$. Estimate the subduction
velocity of a slab assuming the slab is a sphere with radius equal to
half its thickness. To do this, estimate the density contrast from the
thermal model (Eqn 5.4.14) and a coefficient of thermal expansion
$\alpha = 3 \times 10^{-5}\ {}^{\circ}\mathrm{C}^{-1}$. Use a mantle viscosity from Section 5.5.3.
Because this is a back-of-the-envelope calculation, there is no cor¬ 
rect answer, but you should be able to come up with something 
reasonable (within an order of magnitude or two of reality). 

9. The result that a subducting slab that reaches the core should still 
be thermally distinct (Fig. 5.4-5) may seem surprising. For another 
estimate, use the one-dimensional cooling equation in Section 
5.3.2 to estimate how long a slab should need to warm up to 90% 
of the ambient lowermost mantle temperatures, assuming that 
it were immediately transported to the base of the mantle and that 
$\kappa = 10^{-6}\ \mathrm{m^2\,s^{-1}}$.

10. Using the definition of the slab pull force (Eqn 5.4.15): 

(a) Write the force in terms of the age of the subducting plate. 

(b) Explain whether this force would be greater or smaller, and 
why, for increased values of subducting plate age, coefficient 
of thermal expansion, and thermal diffusivity. 

11. Assume that in a subducted slab the depth of the spinel-perovskite 
phase transition deepens from its usual 660 km outside the slab 
to 700 km, and that the core of the slab is 800° colder than the 
surrounding mantle. What is the Clapeyron slope of the phase 
change? 

12. The surface of Venus is much hotter (450°C) than that of Earth. If 
Venus had plate tectonics and the rocks were similar, so that the 
temperature gradient in old lithosphere there were the same as on 
Earth, how would the thickness of the “oceanic” lithosphere 
differ? How would the slab pull and ridge push forces differ? What 
other differences might you expect? 

13. Express the ratio of the slab pull (Eqn 5.4.15) and ridge push 
(Eqn 5.5.6) forces. Explain why this ratio depends on thermal 
diffusivity. Estimate this ratio near a trench where old oceanic 
lithosphere is subducting, assuming that $\kappa = 10^{-6}\ \mathrm{m^2\,s^{-1}}$.

14. To see if momentum can be responsible for the Indian plate’s 
northward motion long after its collision with Asia began, estimate 
the momentum of the Indian plate and that of an ocean liner, and 
compare the two. 

15. Use Mohr’s circle to show why 

(a) Rocks at depth do not fracture under lithostatic pressure 
alone. 

(b) The deviatoric stress needed for fracture increases at greater 
depth. 

16. Suppose that a rock is stressed close to its brittle limit. Show 
graphically which will make the rock fracture sooner: (a) increasing
$\sigma_1$ or (b) decreasing $\sigma_2$ by the same amount (assume a
two-dimensional case where $\sigma_1$ and $\sigma_2$ are both negative, and internal
friction exists). 

17. Suppose that the fracture line for a particular rock is $\tau = 80 - 0.5\sigma$,
where stresses are in MPa. What angle would the normal to a frac¬ 
ture plane make with cq ? If cq is 400 MPa at failure, what is cq? 

18. For the slider block earthquake model in Section 5.7.5: 

(a) Derive an expression for the time between successive slip 
events. 
(b) Sketch the force-slip diagram for two different spring constants,
and use the sketch to explain how the slip and force
drop in a slip event change and why. 

(c) For the slider block model, formulate a quantity analogous 
to an earthquake’s seismic moment, and explain why it 
depends on each term. What is the major difference between 
this quantity and the seismic moment? 

(d) Recall the observation (Fig. 4.6-11) that earthquake stress 
drops are similar for a wide range of earthquakes. If the 
slider block model is relevant, what does this imply? 

(e) What conditions might correspond to aseismic slip, which 
could be viewed as the limit of a continuous series of very 
small slip events? 

Computer problems 

C-1. (a) Write a subroutine to compute the rate and azimuth of plate
motion at a point, given the location and an Euler vector in
the form (pole latitude, longitude, magnitude). 

(b) Use the Euler vector in Table 5.2-1 to test your program on 
the San Andreas and Aleutian site examples in Section 5.2.1. 

C-2. (a) Find the rate and azimuth of Cocos-North America plate
motion at 18.3°N, 102.5°W. 

(b) This location is the epicenter of a large 1985 Mexican earth¬ 
quake, whose mechanism had nodal planes whose strike and 
dip are (127°, 81°) and (288°, 9°). Infer from the tectonics of
the Middle American trench which plane was the fault plane.
Using the methods of Section 4.2, determine the azimuth of
slip during the earthquake. How does this compare to your 
predicted azimuth? 

C-3. (a) Write a subroutine to add and subtract two Euler vectors 
given in the form (pole latitude, longitude, magnitude). The 
output should be a Euler vector in the same form. 

(b) Use your program to determine the absolute Euler vector for 
the Pacific plate using Table 5.2-1. 

(c) Determine the rate and azimuth of absolute plate motion at 
Hawaii (Fig. 5.2-7). Compare the direction to the Hawaiian- 
Emperor seamount chain. 

C-4. Write a program to plot the temperature distribution in the 
oceanic lithosphere as a function of age using the cooling half¬ 
space thermal model (Eqn 5.3.4). Compute erf(s) (Eqn 5.3.3) 
using either available software or numerical integration as 
discussed in problem 4C-6. 

C-5. (a) Write a program to plot the temperature distribution in a 
subducting slab using the analytic thermal model (Eqn 5.4.3). 
Compute it for a plate subducting at 80 mm/yr at an angle of 
45°. Make assumptions that seem reasonable and justify them.

(b) Change the program to make the age of the subducting plate a 
parameter and generate temperature fields for different slabs, 
as in Fig. 5.4-6. 

(c) Using the results of (b) and Fig. 5.4-4, estimate a temperature 
above which deep earthquakes are not observed. 




Seismograms as Signals 


We shall introduce the concepts of signal and noise. We define the signal as the desired part of the data and the noise as the unwanted 
part. Our definition of signal and noise is subjective in the sense that a given part of the data is “signal” for those who know how to 
analyze and interpret the data, but it is “noise” for those who do not. For example, for many years the times of the first arrivals of 
P- and S-waves were the only signals conveyed by an earthquake, and the rest of the seismogram, such as surface waves and coda 

waves, had to be considered as useless until appropriate methods of interpretations were found. 

Thus, through the application of a new technique to old data, an analyst can experience a moment of discovery as joyful as a data 
gatherer does using a new observational device. 

Aki and Richards, Quantitative Seismology, 1980 


6.1 Introduction 

Seismology uses various techniques to study the displacement 
field as a function of position and time associated with elastic 
waves in the earth, and to draw inferences from it about the 
nature of seismic sources and the earth. Although some tech¬ 
niques depend on specific aspects of seismic waves in the earth, 
others rely on general properties of functions of space and time. 

We thus often use a class of techniques known as signal 
processing or time series analysis. Signal processing considers 
functions of time or space, also called series or signals, in gen¬ 
eral terms without regard to the specific physics involved. As a 
result, many wave propagation subjects, including seismology, 
radar, sonar, and optics, can be treated in similar ways. The 
signals can have different forms. For example, in seismology, 
we can treat either a continuous (analog) record of ground
motion or the digital data that result from representing the 
ground motion as being sampled at discrete intervals, provid¬ 
ing numbers that can be manipulated using a computer. 

In general terms, we can think of filtering a signal, or apply¬ 
ing some operation that modifies the signal. We have already 
discussed several examples. A seismometer is a filter, in that it 
yields a record of ground motion that differs from the actual 
ground motion. Similarly, processes in the earth such as dis¬ 
persion or attenuation have effects that can be described as a 
filter acting on the wave field. We can also consciously apply 
filters to enhance parts of a seismogram or seismic wave field 
and suppress others. In this chapter we extend these ideas by 
considering mathematical approaches that are common to 
such applications and then seeing how these approaches give 


additional insight into the physical processes. We discuss some 
basic concepts and provide references at the end of the chapter 
for more extensive treatments. 

6.2 Fourier analysis 

6.2.1 Fourier series 

In many applications, we use an approach based on the idea 
that any time series can be decomposed into the sum or integral 
of harmonic waves of different frequencies, using methods 
known as Fourier analysis. We derived the properties of seismic 
waves using a harmonic wave, a sinusoid of a single frequency, 
and noted that any wave could be treated as the sum of 
harmonic waves. Thus we showed that waves on a string could 
be viewed as the sum of the string’s normal modes, or standing 
waves (Section 2.2.5), and that waves in a spherical earth can 
be written as the sum of the earth’s normal modes (Section 2.9). 
This concept is especially useful when the components with 
various frequencies behave differently. For example, surface 
waves of different frequencies have different apparent velo¬ 
cities (Section 2.8) and seismic wave attenuation varies with 
frequency (Section 3.7). Similarly, we will see shortly that 
seismometers respond differently to ground motion of different 
frequencies. Fourier analysis lets us decompose the signal 
into harmonic waves, consider each harmonic wave separately, 
and then recombine the harmonic waves. Thus we use this 
approach to analyze situations where the effect of the earth or a 
seismometer can be described by a filter. We also use Fourier 
analysis to filter a signal when the part that interests us overlaps
with a part that does not in the time or space domains, but the
two can be separated in the frequency or wavenumber domains.

We first consider the decomposition of a signal with a finite
duration into a Fourier series, or sum of harmonic components
with different frequencies. We will see later that as the duration
of the signal becomes infinite, the Fourier series becomes the
Fourier transform integral.

The Fourier series for an arbitrary function of time f(t)
defined over the interval $-T/2 < t < T/2$ is

$$f(t) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right). \tag{1}$$

This series decomposes f(t) into a sum of Fourier terms that are
sine and cosine functions with different periods, because
$\sin(2\pi nt/T)$ and $\cos(2\pi nt/T)$ are periodic with period $T/n$, or
frequency $n/T$ (Fig. 6.2-1). Larger values of n correspond to
shorter periods, or higher frequencies. For $n = 0$, the cosine
term equals 1 for all values of t, and there is no sine term,
because it would be zero.

[Figure: Fourier terms plotted against time from $-T/2$ to $T/2$.]

Fig. 6.2-1 Successive terms of a Fourier series. Solid lines are $\sin(2\pi nt/T)$;
dashed lines are $\cos(2\pi nt/T)$.

The sine and cosine Fourier terms are a set of orthogonal
functions, which means that the integral of the product of two
different ones over the interval from $-T/2$ to $T/2$ is always zero:

$$\int_{-T/2}^{T/2} \sin\left(\frac{2\pi m t}{T}\right) \sin\left(\frac{2\pi n t}{T}\right) dt = \frac{T}{2}\,\delta_{mn}, \tag{2}$$

$$\int_{-T/2}^{T/2} \cos\left(\frac{2\pi m t}{T}\right) \cos\left(\frac{2\pi n t}{T}\right) dt = \frac{T}{2}\,\delta_{mn}, \tag{3}$$

$$\int_{-T/2}^{T/2} \sin\left(\frac{2\pi m t}{T}\right) \cos\left(\frac{2\pi n t}{T}\right) dt = 0, \tag{4}$$

where the Kronecker delta, $\delta_{mn}$, equals 1 for $m = n$ and 0 otherwise
(Eqn A.3.37). For the special case $m = n = 0$, the integral in
Eqn 2 is zero, and the integral in Eqn 3 is twice the value for any
other $m = n$.¹

To express the Fourier series for a given function, we solve
for the coefficients $a_n$ and $b_n$ by multiplying both sides of Eqn 1
by the appropriate sine or cosine term and integrating from
$-T/2$ to $T/2$. For example, to find the coefficient $a_k$, where k
is some particular integer, we multiply by $\cos(2\pi kt/T)$ and
integrate to get

$$\int_{-T/2}^{T/2} \cos\left(\frac{2\pi k t}{T}\right) f(t)\,dt = \int_{-T/2}^{T/2} \cos\left(\frac{2\pi k t}{T}\right) \left[a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{2\pi n t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right)\right] dt. \tag{5}$$

By the orthogonality relations (Eqns 2-4), the only term in the
sums on the right-hand side whose contribution to the integral
is nonzero is $\cos(2\pi kt/T)$, so the equation simplifies to

$$\int_{-T/2}^{T/2} \cos\left(\frac{2\pi k t}{T}\right) f(t)\,dt = a_k \int_{-T/2}^{T/2} \cos^2\left(\frac{2\pi k t}{T}\right) dt = \frac{T}{2}\,a_k\,(1 + \delta_{k0}), \tag{6}$$

which shows that the coefficient $a_k$ is

$$a_k = \frac{2 - \delta_{k0}}{T} \int_{-T/2}^{T/2} \cos\left(\frac{2\pi k t}{T}\right) f(t)\,dt. \tag{7}$$

1 The proofs of Eqns 2-4 are left for the problems.
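
Although the proofs are left for the problems, the orthogonality of the Fourier terms is easy to check numerically. The Python sketch below is an illustration under stated assumptions, not a proof: it uses a simple midpoint-rule quadrature (the helper names `integrate`, `sin_term`, `cos_term`, and `inner` are hypothetical, as is the choice T = 2) to evaluate the integrals of products of sine and cosine terms over $-T/2 \le t \le T/2$. The expected values are zero for different terms, $T/2$ for a term with itself, and $T$ for the constant ($n = 0$) cosine term.

```python
import math


def integrate(func, a, b, n=2000):
    """Midpoint-rule quadrature of func over the interval [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


T = 2.0  # length of the interval; an arbitrary choice for this check


def sin_term(n):
    """Fourier sine term sin(2 pi n t / T)."""
    return lambda t: math.sin(2 * math.pi * n * t / T)


def cos_term(n):
    """Fourier cosine term cos(2 pi n t / T)."""
    return lambda t: math.cos(2 * math.pi * n * t / T)


def inner(f, g):
    """Integral of f(t) g(t) over -T/2 <= t <= T/2."""
    return integrate(lambda t: f(t) * g(t), -T / 2, T / 2)


ss_diff = inner(sin_term(2), sin_term(3))  # different sine terms: about 0
ss_same = inner(sin_term(3), sin_term(3))  # same sine term: about T/2
cc_same = inner(cos_term(3), cos_term(3))  # same cosine term: about T/2
cc_zero = inner(cos_term(0), cos_term(0))  # constant term: about T
sc_any = inner(sin_term(2), cos_term(2))   # sine times cosine: about 0
```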


Fig. 6.2-2 The first ten terms of the Fourier series for a ramp function. 
The terms are weighted by their coefficients and then summed. The first 
ten terms give a reasonably good representation of the time function, 
but more terms would do better. 


The $a_0$ term is simply

$$a_0 = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\,dt, \tag{8}$$

which corresponds to the average value of the function. The
coefficients of the sine terms are found similarly by

$$b_k = \frac{2}{T} \int_{-T/2}^{T/2} \sin\left(\frac{2\pi k t}{T}\right) f(t)\,dt. \tag{9}$$


Mathematically, what we have done is to consider the 
function f(t) as being in a vector space whose basis vectors 
(Section A.3.6) are the sine and cosine Fourier terms. The 
coefficients $a_k$ and $b_k$ are the components that describe the particular
vector f(t). Thus, multiplying each basis function by the
appropriate coefficient and then summing yields the function. 


Similarly, the operation of finding the coefficients using the 
integrals in Eqns 7-9 corresponds to finding each component 
of a vector by taking the scalar product with the appropriate 
unit basis vector (Eqn A.3.27). 

Figure 6.2-2 illustrates this idea for a ramp function $f(t) = t/T$.
Performing the integrations in Eqns 7-9 gives $a_k = 0$ and
$b_k = (-1)^{k+1}/(k\pi)$. The cosine terms are zero, because the function
is odd ($f(t) = -f(-t)$), whereas cosine is an even function ($f(t) = f(-t)$).
Conversely, if the function were even, the Fourier series
would include only cosine terms. Adding up the first ten sine 
terms reproduces the ramp reasonably well. If more terms were 
used, the ramp would be reproduced even better. The terms 
with small k are longer-period functions, and so describe the 
long-period features of the time series, whereas those with 
larger k reproduce the shorter-period features. 
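
The ramp example can be reproduced with a few lines of code. The Python sketch below assumes T = 1 and uses hypothetical helper names (`b_coeff`, `b_numeric`, `partial_sum`). It compares the closed-form sine coefficient $b_k = (-1)^{k+1}/(k\pi)$ with a midpoint-rule estimate of the integral $(2/T)\int \sin(2\pi kt/T)\,f(t)\,dt$, and then sums the first 10 and the first 200 sine terms at one interior point to show how the misfit shrinks as terms are added.

```python
import math

T = 1.0  # interval length; an arbitrary choice for this sketch


def b_coeff(k):
    """Closed-form sine coefficient of the ramp f(t) = t/T."""
    return (-1) ** (k + 1) / (k * math.pi)


def b_numeric(k, n=4000):
    """Midpoint-rule estimate of (2/T) * integral of sin(2 pi k t/T) * (t/T) dt."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = -T / 2 + (i + 0.5) * h
        total += math.sin(2 * math.pi * k * t / T) * (t / T)
    return 2.0 / T * total * h


def partial_sum(t, n_terms):
    """Sum of the first n_terms sine terms (all cosine coefficients are zero)."""
    return sum(b_coeff(k) * math.sin(2 * math.pi * k * t / T)
               for k in range(1, n_terms + 1))


t0 = 0.2 * T                                      # an interior point of the ramp
err_10 = abs(partial_sum(t0, 10) - t0 / T)        # misfit with ten terms
err_200 = abs(partial_sum(t0, 200) - t0 / T)      # misfit with two hundred terms
```

Near the ends of the interval the partial sums converge more slowly, because the periodic extension of the ramp is discontinuous there.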

We used the Fourier series to express waves on a string as the 
sum of the string’s normal modes (Section 2.2.5). Each normal 
mode has a spatial eigenfunction, which is a Fourier term, 
and an eigenfrequency. The amplitude of each Fourier term 
depends on the source that generated the waves, so different 
waves are represented by differently weighted sums of the 
Fourier terms. For the string the Fourier series described the 
variation of a function in space along a finite string, whereas 
here we use it to describe the variation of a function of time 
over a finite period. Because waves are functions of both time 
and space, Fourier analysis can be used for either variable 
or both. Fourier series are also used in other geophysical ap¬ 
plications to represent functions that vary in space or time 
over finite domains. For example, we used Fourier series to 
describe the temperature fields in cooling oceanic lithosphere 
(Eqn 5.3.19) and in subducting plates (Eqn 5.4.3). 

6.2.2 Complex Fourier series 

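The Fourier series (Eqn 1) can be written in a simpler form.
First, we use the angular frequencies $\omega_n = 2\pi n/T$, expand the
sine and cosine functions into complex exponentials, and
regroup terms as

$$f(t) = a_0 + \frac{1}{2} \sum_{n=1}^{\infty} \left[(a_n - ib_n)\,e^{i\omega_n t} + (a_n + ib_n)\,e^{-i\omega_n t}\right]. \tag{10}$$

Then we use the definitions of the coefficients in Eqns 7-9,
again expanding the sine and cosine functions into complex
exponentials:

$$(a_n - ib_n)/2 = \frac{1}{T} \int_{-T/2}^{T/2} [\cos \omega_n t - i \sin \omega_n t]\,f(t)\,dt = \frac{1}{T} \int_{-T/2}^{T/2} e^{-i\omega_n t} f(t)\,dt,$$

$$(a_n + ib_n)/2 = \frac{1}{T} \int_{-T/2}^{T/2} [\cos \omega_n t + i \sin \omega_n t]\,f(t)\,dt = \frac{1}{T} \int_{-T/2}^{T/2} e^{i\omega_n t} f(t)\,dt. \tag{11}$$

Next, we define

$$F_n = (a_n - ib_n)/2, \quad F_0 = a_0, \quad \text{and} \quad F_{-n} = (a_n + ib_n)/2, \tag{12}$$

so that the Fourier series becomes

$$f(t) = F_0 + \sum_{n=1}^{\infty} F_n e^{i\omega_n t} + \sum_{n=1}^{\infty} F_{-n} e^{-i\omega_n t}. \tag{13}$$

Because $\omega_{-n} = -2\pi n/T = -\omega_n$ and $F_{-n}$ is the complex conjugate
of $F_n$ ($F_{-n} = F_n^*$), the negative exponentials can be written

$$\sum_{n=1}^{\infty} F_{-n} e^{-i\omega_n t} = \sum_{n=-\infty}^{-1} F_n e^{i\omega_n t}. \tag{14}$$

Making these substitutions in Eqn 10 yields the Fourier series
in complex number form:

$$f(t) = \sum_{n=-\infty}^{\infty} F_n e^{i\omega_n t}, \tag{15}$$

where

$$F_n = \frac{1}{T} \int_{-T/2}^{T/2} f(t)\,e^{-i\omega_n t}\,dt. \tag{16}$$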
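
As a check on the complex form, the coefficients $F_n = (1/T)\int f(t)\,e^{-i\omega_n t}\,dt$ can be evaluated numerically and compared with $(a_n - ib_n)/2$. The Python sketch below does this for the ramp $f(t) = t/T$ of Fig. 6.2-2, for which $a_n = 0$ and $b_n = (-1)^{n+1}/(n\pi)$. The choice T = 1, the midpoint-rule quadrature, and the helper names `complex_coeff` and `expected` are assumptions made for illustration.

```python
import cmath
import math

T = 1.0  # interval length; an arbitrary choice for this sketch


def f(t):
    """Ramp function of Fig. 6.2-2."""
    return t / T


def complex_coeff(n, samples=4000):
    """Midpoint-rule estimate of F_n = (1/T) * integral of f(t) exp(-i w_n t) dt."""
    w_n = 2 * math.pi * n / T
    h = T / samples
    total = 0j
    for i in range(samples):
        t = -T / 2 + (i + 0.5) * h
        total += f(t) * cmath.exp(-1j * w_n * t)
    return total * h / T


def expected(n):
    """(a_n - i b_n)/2 for the ramp, valid for n >= 1 (a_n = 0)."""
    b_n = (-1) ** (n + 1) / (n * math.pi)
    return -0.5j * b_n


F3 = complex_coeff(3)    # coefficient for a positive frequency
Fm3 = complex_coeff(-3)  # coefficient for the matching negative frequency
F0 = complex_coeff(0)    # average value of the ramp, which is zero
```

For a real function the coefficient at $-n$ is the complex conjugate of the coefficient at $n$, which is why the negative-frequency terms carry no independent information.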


6.2.3 Fourier transforms 

The complex Fourier series, which represents a function of time 
in terms of a sum over discrete angular frequencies $\omega_n$, can
be extended into the Fourier transform that represents the 
function as an integral over a continuous range of angular fre¬ 
quencies. Thus, although we used the Fourier series to describe 
the discrete normal modes of a finite string and the earth, we 
use the Fourier transform in most seismological applications, 
because we regard the waves as continuous functions of angu¬ 
lar frequency. 

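To do this, we write Eqn 15 as

$$f(t) = \sum_{n=-\infty}^{\infty} F_n e^{i\omega_n t}\,\Delta n \tag{17}$$

(because $\Delta n = 1$), and define the difference between the successive
angular frequencies

$$\Delta\omega = (2\pi/T)\,\Delta n, \tag{18}$$

so that

$$\Delta n = (T\,\Delta\omega)/(2\pi) \tag{19}$$

and

$$f(t) = \sum_{n=-\infty}^{\infty} F_n\,(T/2\pi)\,e^{i\omega_n t}\,\Delta\omega. \tag{20}$$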


Next, we let the period T over which f(t) is defined go to infinity,
so that the angular frequencies $\omega_n$ become close enough
that the discrete $\omega_n$ can be replaced by the continuous variable
$\omega$. As a result, $\Delta\omega$ becomes $d\omega$, and the sum becomes an
integral. We assert (note the difference between seismology and
mathematics texts) that this can be done such that the product
$TF_n$ remains finite and can be replaced by the continuous
function of angular frequency $F(\omega)$. The Fourier series (Eqn 20)
becomes the integral

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega, \tag{21}$$

and the expression for the coefficients (Eqn 16) becomes

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{22}$$

Equation 22 is called the Fourier transform, and Eqn 21 is the
inverse Fourier transform. These can be defined in alternate
ways by interchanging the signs on the exponentials and placing
the $1/2\pi$ before either integral.

It may seem strange that by starting with a real function of
time f(t) we obtain the transform $F(\omega)$, which is a complex
function of angular frequency. The idea of negative angular
frequencies may also seem disturbing. In a sense the two offset
each other: we obtain a real time function by integrating
a complex transform over both positive and negative angular
frequencies.

An important feature of the transform and inverse transform
is that their dimensions are different. For example, if f(t) is
a seismogram that has the dimensions of displacement, its
transform $F(\omega)$ has the dimensions of displacement multiplied
by time (from the dt term). Thus, if f(t) gives ground motion
in centimeters, $F(\omega)$ gives the transform of ground motion in
centimeter-seconds.
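
The transform integral $F(\omega) = \int f(t)\,e^{-i\omega t}\,dt$ and its dimensions can be illustrated with a pulse whose transform is known analytically. The Python sketch below is an illustration under stated assumptions: it takes a Gaussian displacement pulse $f(t) = A\,e^{-t^2/2\sigma^2}$ (the values of `amp`, `sigma`, and `omega` are hypothetical), approximates the integral over a finite window with a midpoint rule, and compares the result with the standard closed form $A\sigma\sqrt{2\pi}\,e^{-\sigma^2\omega^2/2}$. The multiplication by `dt` in the sum is what gives the result units of displacement times time.

```python
import cmath
import math


def fourier_transform(f, omega, t_min, t_max, n=4000):
    """Midpoint-rule estimate of F(omega) = integral of f(t) exp(-i omega t) dt."""
    dt = (t_max - t_min) / n
    total = 0j
    for i in range(n):
        t = t_min + (i + 0.5) * dt
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * dt  # the factor dt gives F units of [f] times time


sigma = 2.0  # s, pulse width; assumed value
amp = 3.0    # cm, peak displacement; assumed value


def pulse(t):
    """Gaussian displacement pulse in cm."""
    return amp * math.exp(-t * t / (2 * sigma * sigma))


def analytic(omega):
    """Closed-form transform of the Gaussian pulse, in cm s."""
    return amp * sigma * math.sqrt(2 * math.pi) * math.exp(-(sigma * omega) ** 2 / 2)


omega = 0.5  # rad/s, angular frequency at which to evaluate the transform
F_num = fourier_transform(pulse, omega, -40.0, 40.0)
F_ref = analytic(omega)
```

Because this pulse is real and even in time, its transform is real; a time-shifted pulse would acquire a nonzero imaginary part.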

The Fourier transform, a complex-valued function of angu¬ 
lar frequency, can be written in terms of two real-valued func¬ 
tions of angular frequency: 

$$F(\omega) = |F(\omega)|\,e^{i\phi(\omega)}, \tag{23}$$
[Figure panels: Vanuatu earthquake ($M_s$ 6.5) time series, plotted against time (s); amplitude spectrum, plotted against frequency (Hz).]

Fig. 6.2-3 Vertical-component seismogram for a moderate-sized ($M_s = 6.5$) earthquake recorded in the South Pacific. The amplitude spectra of the surface
waves and a portion of the body waves, obtained by transforming different portions of the seismogram into the frequency domain, show that the surface
waves contain longer-period energy than the body waves.


where

$$|F(\omega)| = [F(\omega)F^*(\omega)]^{1/2} = [\mathrm{Re}^2(F(\omega)) + \mathrm{Im}^2(F(\omega))]^{1/2} \tag{24}$$

is called the amplitude spectrum, and

$$\phi(\omega) = \tan^{-1}\big(\mathrm{Im}(F(\omega))/\mathrm{Re}(F(\omega))\big) \tag{25}$$

is the phase spectrum.²
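
In practice, the amplitude and phase are computed from the real and imaginary parts of each complex spectral value. The Python sketch below applies the amplitude and phase definitions to a single hypothetical value of the transform; the function names are illustrative. One practical point, stated here as a general numerical caution rather than something from the text: a plain arctangent of the ratio Im/Re cannot distinguish opposite quadrants, so the two-argument arctangent (`math.atan2`) is used to recover a phase from which the complex value can be rebuilt.

```python
import cmath
import math


def amplitude(F):
    """|F| = sqrt(F F*) = sqrt(Re(F)^2 + Im(F)^2)."""
    return math.sqrt((F * F.conjugate()).real)


def phase(F):
    """Phase in radians; atan2 keeps track of the correct quadrant."""
    return math.atan2(F.imag, F.real)


F = complex(-1.0, 1.0)              # hypothetical spectral value
amp = amplitude(F)                  # sqrt(2)
phi = phase(F)                      # 3*pi/4, in the second quadrant
naive = math.atan(F.imag / F.real)  # -pi/4: the quadrant information is lost
rebuilt = amp * cmath.exp(1j * phi) # amplitude and phase together recover F
```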

Both the amplitude and the phase spectra are needed to 
fully represent the transform, which is also called the complex 
spectrum. In many applications only the amplitude spectrum is 
shown, because it indicates how the energy (the square of the 
amplitude) in the time series depends on frequency. Figure 6.2-3
shows a seismogram for a moderate-size earthquake, together
with amplitude spectra for the body and surface wave portions
of the seismogram. Looking at the seismogram, we see that the
surface waves contain longer-period energy than the body
waves. The spectra demonstrate this: the body wave is dominated
by energy with frequencies between 0.1 and 0.08 Hz
(periods of 10-12 s), whereas the surface wave is dominated
by energy with frequencies between 0.07 and 0.05 Hz (periods
of 14-20 s). For comparison, Fig. 6.2-4 shows data for a
much larger earthquake. The seismogram, from an instrument 
designed to record at long periods, covers seven days after the 
earthquake. The large oscillations with periods of about 
90,000 s are tides within the solid earth. Superimposed on 
these is the signal due to the earthquake. The portion of the 
amplitude spectrum shown indicates the presence of energy 
at long periods (0.002 Hz corresponds to 500 s period). The 
energy is concentrated at discrete peaks, corresponding to the 
earth’s normal modes. 

The Fourier transform F(ω) is another way of representing
the time series f(t). We speak of f(t) as being in the “time
domain,” and F(ω) as being in the “frequency domain.” The
two representations are equivalent, because we can easily convert
data from one domain to the other without losing any
information. We will see that some methods of analyzing seismograms
are more easily conducted in the frequency domain,
and that there is a relation between time and frequency domain
operations.

2 The notations Re and Im indicate the real and imaginary portions of a complex
number (Section A.2).

[Figure panels: Bolivian earthquake (M_w 8.3) seismogram; amplitude spectrum, plotted against frequency (Hz).]
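As a numerical illustration of Eqns 23-25, the following Python sketch computes amplitude and phase spectra for a synthetic two-frequency time series. NumPy's discrete Fourier transform stands in for the continuous integral, and the sample interval, record length, and test frequencies are assumptions chosen for illustration rather than values taken from the figures.

```python
import numpy as np

dt = 1.0                      # sample interval in s (assumed)
n = 1000                      # number of samples (assumed)
t = np.arange(n) * dt

# Synthetic "seismogram" in cm: a strong 0.05 Hz term plus a weaker 0.10 Hz term.
f_t = 2.0 * np.cos(2 * np.pi * 0.05 * t) + 1.0 * np.cos(2 * np.pi * 0.10 * t)

# Discrete approximation to F(omega); multiplying by dt gives units of cm s.
F = np.fft.rfft(f_t) * dt
freq_hz = np.fft.rfftfreq(n, dt)

amplitude = np.abs(F)                 # amplitude spectrum, Eqn 24
phase = np.arctan2(F.imag, F.real)    # phase spectrum, Eqn 25 (quadrant-aware)

# Frequency carrying the largest spectral amplitude.
dominant_hz = freq_hz[np.argmax(amplitude)]

# Returning to the time domain recovers the original series.
recovered = np.fft.irfft(F / dt, n=n)
```

Here `dominant_hz` picks out the stronger, longer-period term, and `recovered` matches `f_t` to rounding error, which is the sense in which the two domains carry the same information.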

The Fourier transform and inverse transform relate a function
of time f(t) and its transform F(ω), a function of angular
frequency. Similar relations apply between other pairs of variables.
In seismology, the other commonly used pair is distance
and wavenumber. Because the wavenumber is the spatial frequency
(Section 2.2.2), it is related to distance in the same
way that angular frequency is related to time. Hence, there are
applications in which a double Fourier transform is taken to
convert a set of seismograms, which describe displacement as
a function of space and time, into a function of wavenumber
and frequency (Section 3.3.5). A triple Fourier transform can
similarly be taken for data in two space dimensions and time.

6.2.4 Properties of Fourier transforms 

The Fourier transform has a number of interesting properties 
that we often use, whose proofs are left for the problems. 

(1) The Fourier transform is linear: if F(ω) and G(ω) are
the transforms of f(t) and g(t), then (aF(ω) + bG(ω)) is the
transform of (af(t) + bg(t)). This property makes the Fourier
transform useful in filtering, because it permits us to treat a
signal as the sum of several signals, knowing that the transform
will be the sum of their transforms.

(2) The Fourier transform of a purely real time function has 
the symmetry 

F(-\omega) = F^*(\omega). (26)

Thus for seismograms (which are real because the motion of 
the ground is purely real), the values of the transform for the 
negative frequencies can be found from those for positive 
frequencies. Hence, in filtering seismograms, we can operate on 
only the positive frequencies and compute the value of the 
transform at the negative frequencies by taking the conjugate, 
thus saving computer time and storage space. 

(3) The Fourier transform of a time series shifted in time is
found by changing the phase of the transform: if the transform
of f(t) is F(ω), the transform of f(t - a) is e^{-iωa}F(ω). In analyzing
seismograms it is arbitrary what time we choose as the origin;
the amplitude spectrum stays the same, and the phase changes
in a simple way. This makes sense, because in the absence of
attenuation a wave keeps its shape but changes in phase as
it propagates. Similarly, shifting a Fourier transform in frequency
causes a phase change in the corresponding time series:
the inverse transform of F(ω - a) is e^{iat}f(t). These relations are
sometimes called shift theorems.

(4) The Fourier transform of the derivative of a time function
is found by multiplication: (iω)F(ω) is the transform of
df(t)/dt. Similarly, (iω)^n F(ω) is the transform of d^n f(t)/dt^n.
This makes differentiation easy on a computer, and is an easy
way to change a displacement record into velocity, or velocity
into acceleration. This property also makes it easy to solve
differential equations (e.g., Eqn 3.7.8) using the Fourier transform,
an approach that is often posed as using a sinusoidal
trial solution. Hence we sometimes write and operate on the
wave equation using the Fourier transform of the wave field
(Eqns 2.2.34, 3.3.74).

(5) The total energy in a Fourier transform is the same as that
in the time series:

\int_{-\infty}^{\infty} |f(t)|^2 \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(\omega)|^2 \, d\omega, (27)

a relation known as Parseval’s theorem. This relation arises
because the time series and its Fourier transform are equivalent
representations.
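The five properties can be checked numerically. The sketch below does so for a smooth test pulse using NumPy's discrete Fourier transform, under stated assumptions: the pulse shape, sampling, and shift are arbitrary choices, the time shift is applied circularly (harmless here because the pulse is negligible at both ends of the record), and the energy check assumes the convention in which the factor 1/2π sits in the inverse transform.

```python
import numpy as np

n, dt = 1024, 0.01                        # assumed sampling
t = np.arange(n) * dt
f = np.exp(-((t - 5.0) / 0.3) ** 2)       # smooth pulse, negligible at both ends
omega = 2 * np.pi * np.fft.fftfreq(n, dt) # angular frequency of each DFT bin
F = np.fft.fft(f) * dt                    # discrete approximation to F(omega)

# (1) Linearity: the transform of 2f + 3g is 2F + 3G.
g = np.cos(3.0 * t) * f
G = np.fft.fft(g) * dt
linear_ok = np.allclose(np.fft.fft(2 * f + 3 * g) * dt, 2 * F + 3 * G)

# (2) Real time series: F(-omega) equals the conjugate of F(omega).
symmetry_ok = np.allclose(F[1:][::-1], np.conj(F[1:]))

# (3) Time shift by a: the transform is multiplied by exp(-i omega a).
shift_samples = 100
a = shift_samples * dt
shift_ok = np.allclose(np.fft.fft(np.roll(f, shift_samples)) * dt,
                       np.exp(-1j * omega * a) * F)

# (4) Differentiation: multiply by i omega, then invert.
velocity = np.fft.ifft(1j * omega * F / dt).real
exact_velocity = -2.0 * (t - 5.0) / 0.3 ** 2 * f
derivative_ok = np.allclose(velocity, exact_velocity, atol=1e-8)

# (5) Energy: sum |f|^2 dt equals (1 / 2 pi) sum |F|^2 d omega.
d_omega = 2 * np.pi / (n * dt)
energy_time = np.sum(np.abs(f) ** 2) * dt
energy_freq = np.sum(np.abs(F) ** 2) * d_omega / (2 * np.pi)
parseval_ok = np.isclose(energy_time, energy_freq)
```

Property (4) is the one most often used in practice: the same two lines convert a displacement record to velocity, and applying them twice gives acceleration.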

6.2.5 Delta functions 

In using Fourier transforms, we often need to describe a signal
that is concentrated at a single time or frequency. This is done
using the Dirac delta function, an entity that is not truly a function,
but rather a generalized function that is the limit of a
sequence of continuous functions. The delta function can be
defined in several ways, each of which offers a different insight
into its nature.

A delta function at t = t_0, written δ(t - t_0), is defined as the
limit of a Gaussian function that keeps the area constant (= 1)
as the width (σ) narrows and the height, 1/(σ√(2π)), increases
(Fig. 6.2-5):

\delta(t - t_0) = \lim_{\sigma \to 0} \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{t - t_0}{\sigma} \right)^2 \right]. (28)

Thus the Dirac delta function is a continuous function analogous
to the Kronecker delta symbol, δ_ij (Eqn A.3.37) which is
a function of two discrete variables, i and j. An alternative
definition comes from defining the delta function by how it behaves
when integrated, a property called “sifting.” This is defined as

f(t_0) = \int_{-\infty}^{\infty} f(t) \, \delta(t - t_0) \, dt. (29)

Thus the delta function at t = t_0 “sifts out” the value of a function
at time t_0 if it is multiplied by the function and integrated
over all time.

A third definition comes from considering a step, or
Heaviside, function H(t - t_0) that is 0 for time before t = t_0 and
equal to 1 afterwards (Fig. 6.2-5). The delta function δ(t - t_0) is
the derivative of the step, because it is zero except at t_0, when
it goes to infinity. Because the delta function is located where
its argument is zero, δ(t_0 - t) is at time t_0, whereas δ(t + t_0) is at
time -t_0.

Fig. 6.2-5 Definitions of a delta function at t = t_0. Top: δ(t - t_0) is the limit
of a Gaussian function with width σ. The area stays equal to 1 as the width
narrows and the height increases. Bottom: δ(t - t_0) is the derivative of a
step function H(t - t_0) at time t = t_0, which is zero at all times except near
t_0, when it goes to infinity.

To find the Fourier transform of the delta function, we use
the definition of the transform (Eqn 22) with f(t) = δ(t - t_0),

F(\omega) = \int_{-\infty}^{\infty} \delta(t - t_0) \, e^{-i\omega t} \, dt = e^{-i\omega t_0}, (30)

and evaluate the integral by the sifting property (Eqn 29). If the
delta function is at time zero,

F(\omega) = \int_{-\infty}^{\infty} \delta(t) \, e^{-i\omega t} \, dt = 1. (31)

Similarly, for a delta function at t = t_0, the amplitude spectrum
(Eqn 24) is also

|F(\omega)| = |e^{-i\omega t_0}| = 1, (32)

but the phase spectrum (Eqn 25) is

\phi(\omega) = -\omega t_0, (33)

as shown in Fig. 6.2-6. This example illustrates one of the
Fourier transform properties noted in Section 6.2.4, that shifting
a function by a time t_0 changes its transform by e^{-iωt_0}.

Fig. 6.2-6 The Fourier transform of a delta
function, δ(t - t_0), is e^{-iωt_0}. Its amplitude
spectrum has unit amplitude at all
frequencies, and its phase spectrum
has a slope of -t_0.

The delta function’s amplitude spectrum has unit amplitude
at all frequencies. Another way to see this is to write the inverse
transform, using Eqn 21,

f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega t_0} e^{i\omega t} \, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega (t - t_0)} \, d\omega = \delta(t - t_0), (34)

which shows that the delta function is an integral or sum of
sinusoids of all frequencies. These are in phase only at time t_0,
giving a large amplitude, and are out of phase at all other times,
giving a zero amplitude (Fig. 6.2-7).



Fig. 6.2-7 Because the Fourier transform of a delta function has unit
amplitude at all frequencies, it corresponds to the sum of sinusoids of all
frequencies. These are in phase only at time t_0, giving a large amplitude,
and are out of phase at all other times, giving zero amplitude. In this
example, five sinusoids (dashed lines a-e) with unit amplitude (cos[(2n +
1)(t - t_0)]) are summed (solid line), giving a peak of amplitude 5 at t_0.
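The construction in Fig. 6.2-7 is easy to reproduce. The sketch below sums unit-amplitude cosines that are all in phase at t_0, assuming the form cos[(2n + 1)(t - t_0)] for the individual sinusoids and an arbitrary value of t_0; the helper name `partial_sum` is hypothetical.

```python
import numpy as np

t0 = 2.0                                        # assumed time of the delta function
t = np.linspace(t0 - np.pi, t0 + np.pi, 2001)   # one period centered on t0

def partial_sum(n_terms):
    """Sum of n_terms unit-amplitude cosines cos[(2n + 1)(t - t0)]."""
    return sum(np.cos((2 * n + 1) * (t - t0)) for n in range(n_terms))

five = partial_sum(5)      # the five-term case of the figure
fifty = partial_sum(50)    # more terms: a taller, narrower peak

peak_time = t[np.argmax(five)]

# Fraction of the window in which each sum exceeds half of its peak value.
width_five = np.mean(five > 0.5 * five.max())
width_fifty = np.mean(fifty > 0.5 * fifty.max())
```

The peak sits at t_0 with height equal to the number of terms, and it narrows as terms are added, which is the behavior expected of a sequence approaching a delta function.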


Although so far we have discussed delta functions only in
the time domain, they are also useful in the frequency domain.
The properties of the frequency domain delta functions are
analogous to those in the time domain. A delta function at
angular frequency ω_0, δ(ω - ω_0), has an inverse transform of

f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \delta(\omega - \omega_0) \, e^{i\omega t} \, d\omega = \frac{1}{2\pi} e^{i\omega_0 t}. (35)

Thus we can express the delta function in terms of its Fourier
transform,

\delta(\omega - \omega_0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega_0 t} e^{-i\omega t} \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(\omega_0 - \omega)t} \, dt, (36)

showing that it is the integral, or sum, of sinusoids that are in
phase only at frequency ω_0.

Delta functions in angular frequency give the spectra of
sinusoids with a single frequency. For example, a cosine with
frequency ω_0, given by

f(t) = \cos \omega_0 t = (e^{i\omega_0 t} + e^{-i\omega_0 t})/2, (37)

has a Fourier transform of

F(\omega) = \frac{1}{2} \int_{-\infty}^{\infty} [e^{i\omega_0 t} + e^{-i\omega_0 t}] \, e^{-i\omega t} \, dt = \frac{1}{2} \int_{-\infty}^{\infty} [e^{i(\omega_0 - \omega)t} + e^{-i(\omega_0 + \omega)t}] \, dt. (38)

By Eqn 36, this is the sum of two delta functions in the frequency
domain,

F(\omega) = \pi [\delta(\omega - \omega_0) + \delta(\omega + \omega_0)]. (39)


Thus the amplitude spectrum of the cosine time function in
Eqn 37 consists of two delta functions, one at ω_0 and one at
-ω_0. If the time function were a sine rather than a cosine, the
amplitude spectrum would be the same, but the phase spectrum
would be different. Given the relation between the transforms
of functions shifted in time discussed in the previous section,
this makes sense, because a sine function is a time-shifted
cosine, and vice versa.

This example illustrates one of the reasons for using Fourier
transforms. The frequency domain description of the function
is simpler, because a large number of points are needed to
accurately describe the cosine as a function of time, but only
two complex numbers, the values of the transforms at ±ω_0, are
needed to describe it as a function of frequency. Time series
more complicated than a pure cosine are often more easily
described in the frequency domain, and processes that act on
the time series are also often more easily represented in the
frequency domain. In such cases, it is common to work in the
frequency domain and then use the inverse transform to
generate the final time series.
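A discrete analogue of Eqn 39 can be seen with a few lines of NumPy. In this sketch the record length and test frequency are assumptions, chosen so that the record holds a whole number of cycles and each delta function falls on a single frequency sample.

```python
import numpy as np

n, dt = 1000, 0.01               # assumed: 10 s record sampled at 100 Hz
t = np.arange(n) * dt
f0 = 5.0                         # assumed test frequency in Hz (50 full cycles)
freq_hz = np.fft.fftfreq(n, dt)

C = np.fft.fft(np.cos(2 * np.pi * f0 * t))   # spectrum of the cosine
S = np.fft.fft(np.sin(2 * np.pi * f0 * t))   # spectrum of the sine

# Of the 1000 complex values, only two are significantly nonzero.
cosine_peaks = freq_hz[np.abs(C) > 1e-6 * n]

# Same amplitude spectrum for sine and cosine, different phase at +f0.
k = int(round(f0 * n * dt))      # index of the +f0 sample
phase_cos = np.angle(C[k])
phase_sin = np.angle(S[k])
```

The thousand time samples of the cosine reduce to two spectral values at ±f0, and the sine differs from the cosine only by a phase of -π/2 at +f0, as expected for a quarter-period time shift.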

6.3 Linear systems 

Among the uses of Fourier analysis in seismology is modeling
different factors affecting a seismogram. First, a seismogram
is a record of ground motion that includes the effect of the
seismometer. Furthermore, the ground motion combines the
effects of the seismic source and the elastic and anelastic earth
structure along the propagation path (Section 4.3). To characterize
the combined effects of these different factors, we use the
idea of a linear system, a general representation of any device
or process that takes an input signal and modifies it. This representation
treats these processes as mathematical operators
transforming an input signal into an output signal.

6.3.1 Basic model 

A linear system is one in which if input signals x_1(t) and x_2(t)
produce output signals y_1(t) and y_2(t), the combined input
(Ax_1(t) + Bx_2(t)) yields (Ay_1(t) + By_2(t)) (Fig. 6.3-1). We have
previously referred to this feature as the principle of superposition.
Fortunately, the earth generally behaves this way in
transmitting seismic waves. As a result, linear system models
are used in a wide variety of seismological applications. Fourier
analysis is a natural tool for studying linear systems because the
Fourier transform has these same linear properties (Section 6.2.4).

We characterize a linear system by its response to an impulsive
delta function in time (Fig. 6.3-2). This impulse response
f(t) can be used to find the response of the system to an
arbitrary input signal. Viewed in the frequency domain, the
impulse, whose spectral amplitude is equal to 1 at all frequencies,
gives rise to an output F(ω), which is the transform of
the impulse response, sometimes called the transfer function.
Thus, if the input signal is an arbitrary signal x(t), with
transform X(ω), the resulting output spectrum is just the input
spectrum times the spectrum of the impulse response,

Y(\omega) = X(\omega) F(\omega). (1)

Because the transforms are generally complex numbers, the
phase as well as the amplitude of the input signal is usually
modified.

Fig. 6.3-1 Definition of a linear system.

Fig. 6.3-2 Characterization of a linear system by its impulse response f(t)
and transfer function F(ω).

The output in the time domain y(t) can be found by inverting
the transform,

y(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(\omega) F(\omega) \, e^{i\omega t} \, d\omega. (2)

To see that this works, note that for the impulse x(t) = δ(t),
X(ω) = 1, and y(t) = f(t). This equation gives another way to
think of the impulse response. For a harmonic input signal of
unit amplitude e^{iω_0 t}, whose transform is the delta function in
frequency

X(\omega) = 2\pi \delta(\omega - \omega_0), (3)

the output is

y(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi \delta(\omega - \omega_0) F(\omega) \, e^{i\omega t} \, d\omega = F(\omega_0) e^{i\omega_0 t}, (4)

a harmonic signal of the same frequency with the amplitude of
the transfer function at that frequency.

It is interesting to consider the relation between the input
time function, the impulse response, and the output time
function. To do this, we expand Eqn 2 by writing out the
transforms X(ω) and F(ω),

y(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} x(\tau) e^{-i\omega\tau} \, d\tau \right] \left[ \int_{-\infty}^{\infty} f(\tau') e^{-i\omega\tau'} \, d\tau' \right] e^{i\omega t} \, d\omega, (5)

and regrouping terms,

y(t) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x(\tau) f(\tau') \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - \tau' - \tau)} \, d\omega \right] d\tau \, d\tau'. (6)

Using the inverse transform of the delta function (Eqn 6.2.34),

\frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - \tau' - \tau)} \, d\omega = \delta(t - \tau' - \tau), (7)

we eliminate the frequency integral and obtain

y(t) = \int_{-\infty}^{\infty} x(\tau) \left[ \int_{-\infty}^{\infty} f(\tau') \, \delta(t - \tau' - \tau) \, d\tau' \right] d\tau. (8)

Finally, carrying out the inner integration using the sifting
property of the delta function (Eqn 6.2.29) yields

y(t) = \int_{-\infty}^{\infty} x(\tau) f(t - \tau) \, d\tau. (9)

This integral operation, known as the convolution of the
functions x(t) and f(t), is often written as

y(t) = x(t) * f(t). (10)

Fig. 6.3-3 A simple bandpass filter specified in the
frequency (top; transfer function plotted against frequency in Hz) and time
(bottom; impulse response) domains.

The output of a linear system is thus the convolution of the input
signal and the impulse response. Comparison of Eqns 10 and
1 shows the relation between operations in the two domains:
convolution in the time domain corresponds to multiplication
in the frequency domain. The reverse is also true: frequency
domain convolution corresponds to time domain multiplication.
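The equivalence of the two routes can be checked numerically. The sketch below writes the convolution integral of Eqn 9 as an explicit sum and compares it with the product of spectra from Eqn 1. The impulse response, input signal, and sample interval are hypothetical choices, and `convolve_direct` is an illustrative helper rather than a library routine.

```python
import numpy as np

def convolve_direct(x, f, dt):
    """Discrete form of Eqn 9: y(t) = integral of x(tau) f(t - tau) d tau."""
    y = np.zeros(len(x) + len(f) - 1)
    for i, x_i in enumerate(x):
        # Each input sample launches a scaled, delayed copy of the impulse response.
        y[i:i + len(f)] += x_i * f * dt
    return y

dt = 0.05                                          # assumed sample interval in s
tau = np.arange(80) * dt
f = np.exp(-tau / 0.5) * np.sin(2 * np.pi * tau)   # hypothetical impulse response
x = np.random.default_rng(0).standard_normal(200)  # arbitrary input signal

# Time domain route.
y_time = convolve_direct(x, f, dt)

# Frequency domain route: multiply spectra zero-padded to the output length.
n_out = len(y_time)
y_freq = np.fft.irfft(np.fft.rfft(x, n_out) * np.fft.rfft(f, n_out), n_out) * dt

# A unit-area spike at time zero returns the impulse response itself.
impulse = np.zeros(200)
impulse[0] = 1.0 / dt
y_impulse = convolve_direct(impulse, f, dt)[:len(f)]
```

The zero padding matters: without it the discrete transform would wrap the tail of the output around to the start of the record.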

We thus have two different ways of implementing any operation
that can be characterized by a linear system. The effect
that the system has on an input signal is specified either by the
impulse response in the time domain or by its transform, the
transfer function in the frequency domain. For example, to
filter a seismogram so that only a certain range of frequencies
remains, we can filter in either the frequency or time domains.
To do this in the frequency domain, we can define a simple
bandpass filter, a function which is 1 in the frequency range
of interest and 0 for all other frequencies. Figure 6.3-3 (top)
shows the amplitude spectrum of the filter, whose phase spectrum
is defined as zero for all frequencies. To perform the
filtering, we multiply this function by the Fourier transform
of the seismogram, point by point for all frequencies, and
take the inverse transform of the result. The resulting filtered
seismogram has only the desired frequencies. Alternatively,
however, we could find the impulse response of the bandpass
filter by taking the inverse Fourier transform of the amplitude
spectrum in the top of Fig. 6.3-3, and filter the data by convolving
this impulse response (Fig. 6.3-3, bottom) with the
seismogram in the time domain.
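A minimal frequency domain version of this filter might look like the following sketch. The function name `boxcar_bandpass`, the passband limits, and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np

def boxcar_bandpass(signal, dt, f_low, f_high):
    """Keep only frequencies f_low <= f <= f_high (Hz); zero phase, sharp corners."""
    spectrum = np.fft.rfft(signal)
    freq_hz = np.fft.rfftfreq(len(signal), dt)
    transfer = ((freq_hz >= f_low) & (freq_hz <= f_high)).astype(float)
    # irfft rebuilds the negative frequencies as complex conjugates,
    # so the filtered series is real.
    return np.fft.irfft(spectrum * transfer, n=len(signal))

dt = 0.1                                            # assumed sample interval in s
t = np.arange(2000) * dt
long_period = np.sin(2 * np.pi * 0.05 * t)          # inside the passband
short_period = 0.5 * np.sin(2 * np.pi * 1.0 * t)    # outside the passband
filtered = boxcar_bandpass(long_period + short_period, dt, 0.02, 0.2)

# Impulse response: what comes out when a spike goes in.
spike = np.zeros(2000)
spike[1000] = 1.0
impulse_response = boxcar_bandpass(spike, dt, 0.02, 0.2)
```

Here `filtered` reproduces the long-period term alone. The `impulse_response` array is symmetric about the spike and is nonzero before it, which is the noncausal behavior of a zero-phase filter with a limited passband.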

A few points about this simple filter are worth noting. First,
although it is typical to plot the transfer function only for
the positive frequencies, the filter is also defined for negative
frequencies, to ensure that the resulting signal is real (Section
6.2.4). Second, the peculiar appearance of the impulse
response makes sense when we recall that the impulse response
describes what comes out of the filter when a delta function
comes in (Fig. 6.3-2). The delta function’s amplitude spectrum
is constant for all frequencies, but only some of these frequencies
are transmitted through the filter. The lack of high frequencies
is particularly noticeable, and results in the noncausal
impulse response beginning before time zero. We noted a similar
phenomenon in Section 3.7.8, where anelasticity acted as a
filter, removing high frequencies and thus making the waveforms
noncausal unless the effects of physical dispersion were
included. Third, this filter has sharp “corners” at the edges of
the passband, although in real applications the corners are
smoothed for reasons we discuss shortly.

Fig. 6.3-4 When a signal goes through two linear systems in succession,
the net output is the convolution of the impulse responses in the time
domain, or the product of the transfer functions in the frequency domain.

Because the same effect can be achieved by either time domain
or frequency domain filtering, the choice of domain can
be made for convenience. Surprisingly, the operations of taking
transforms and inverse transforms are sufficiently fast in computation
that it generally makes sense to filter in the frequency
domain. An attraction of this method is that filters are usually
easier to specify in the frequency domain, because it is clear
which are the desired and undesired parts of the signal. For example,
in Fig. 6.3-3 (bottom), the corresponding time domain
filter is difficult to visualize intuitively. Similarly, the transfer
function, or instrument response, of a seismometer is more
easily specified in the frequency domain, as we will discuss in
Section 6.6.


6.3.2 Convolution and deconvolution modeling 

Linear system ideas are so pervasive in seismology that we
discussed them in applications such as reflection seismology
(Section 3.3.6) and earthquake source studies (Section 4.3)
before we justified them mathematically. One reason why these
models are so useful is that they are easily generalized to multiple
linear systems, so quite complicated physical effects can be
described. Specifically, if a signal x(t) goes through two linear
systems in succession (Fig. 6.3-4), with impulse responses f(t)
and g(t), the net output is either a convolution in the time domain,

y(t) = x(t) * f(t) * g(t), (11)

or the product of the transfer functions in the frequency
domain

Y(\omega) = X(\omega) F(\omega) G(\omega). (12)

We can extend this to an arbitrary number of linear systems.

A common application is to think of a seismogram as the
output resulting from sending a source signal through a set of
linear systems. In the simplest case, the seismogram u(t) can be
written in terms of three basic effects,

u(t) = x(t) * g(t) * i(t), (13)

where x(t) is the source signal, g(t) is the response of an
operator representing the effects of earth structure along the
path of the seismic waves, and i(t) is the impulse response of the
seismometer.

Fig. 6.3-5 A seismogram can be modeled as the convolution of the source
signal with operators representing the effects of earth structure and
the seismometer. This can be done in the time domain as a set of
convolutions, u(t) = x(t) * g(t) * i(t), or in the frequency domain as a set
of multiplications, U(ω) = X(ω)G(ω)I(ω). (After Chung and Kanamori,
1980. Phys. Earth Planet. Inter., 23, 134-59, with permission from
Elsevier Science.)

Fig. 6.3-6 Transfer functions for various seismometers, plotted against
period (s), some of which are
discussed in Section 6.6. SRO is the Seismic Research Observatory, IDA
is International Deployment of Accelerometers, VLP is Very Long Period,
and BRB is Broadband. Transfer functions are the frequency domain
equivalents of the time domain instrument response shown in Fig. 6.3-5
as i(t).

Figure 6.3-5 shows a simple example: a seismogram resulting
from the convolution of a trapezoidal source function
representing the signal emitted by an earthquake with operators
giving the effects of earth structure and the seismometer.
Each operator can be specified in either domain. For example,
the time domain impulse response of a seismometer reflects the
fact that its transfer function depends on frequency (Fig. 6.3-6).
Once the different effects are characterized by their response
in the time or frequency domain, the seismogram due to their
combined effects can be obtained.
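The three-operator model of Eqn 13 can be sketched with synthetic ingredients. Everything below is hypothetical: a trapezoidal source, a structure operator with two arrivals, and a damped oscillation standing in for an instrument response. None of these values come from Fig. 6.3-5, and amplitudes are left unnormalized.

```python
import numpy as np

dt = 0.1  # assumed sample interval in s

# Source x(t): trapezoidal time function (rise, plateau, fall).
ramp = np.linspace(0.0, 1.0, 11)
source = np.concatenate([ramp, np.ones(20), ramp[::-1]])

# Structure g(t): a direct arrival plus a weaker, reversed-polarity later arrival.
structure = np.zeros(300)
structure[50] = 1.0
structure[120] = -0.4

# Instrument i(t): damped oscillation (a stand-in, not a real seismometer).
ti = np.arange(100) * dt
instrument = np.exp(-ti / 1.5) * np.sin(2 * np.pi * ti / 3.0)

# Time domain: u(t) = x(t) * g(t) * i(t).
u_time = np.convolve(np.convolve(source, structure), instrument)

# Frequency domain: U = X G I, with each series zero-padded to the output length.
n = len(u_time)
U = np.fft.rfft(source, n) * np.fft.rfft(structure, n) * np.fft.rfft(instrument, n)
u_freq = np.fft.irfft(U, n)

# The order of the operators does not matter.
u_reordered = np.convolve(np.convolve(instrument, source), structure)
```

Because multiplication of the spectra commutes, the synthetic seismogram is the same whichever operator is applied first, which is why the source, path, and instrument effects can be modeled separately.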

Convolution can be used to describe the response of a system
in space as well as time. For example, probabilistic earthquake
hazard maps like Fig. 1.2-3 can be viewed as two-dimensional
convolutions in space of an assumed distribution of earthquake
sources with an impulse response like Fig. 1.2-5 giving the
expected ground motion as a function of earthquake magnitude
and distance.

Often the impulse response is defined in both space and time.
This is the basic approach used to find the response of the earth
to a seismic source (Chapter 4). The displacement at a point x
and time t is

u(x, t) = \int \int G(x - x'; t - t') \, f(x', t') \, dt' \, dV', (14)

where G(x - x'; t - t') is the Green’s function,¹ the impulse
response to a source at position x' and time t', and f(x', t') is
the distribution of seismic sources. Thus the integral gives the
total response due to the distribution of sources. In most cases
the source is limited in space and time, so the integral is done
over the source region. Often the source is at a point in space or
time, so f(x', t') contains delta functions and is easily integrated
using the sifting property. A nice feature of this formulation
is that the principle of reciprocity, which says that the
source and the receiver can be interchanged, emerges directly.
The Green’s function in Eqn 14 is for a laterally homogeneous
medium, so the response depends only on the distance
between the source and the receiver. In a general medium
Eqn 14 becomes

u(x, t) = \int \int G(x, t; x', t') \, f(x', t') \, dt' \, dV'. (15)

When a system is described by a convolution, we can examine
the effects of the different contributing factors using
deconvolution. We start with the output and one of the time
series that were convolved to form it, and then find the other.
For example, in Section 3.3.6 we discussed using seismic reflection
data to obtain the sharpest resolution of reflectors in the
earth. We assumed that a seismogram s(t) results from convolution
of a source pulse, or wavelet, w(t), and an earth structure
operator, r(t). r(t), known as a reflector series, is presumed
to be a set of delta functions with positions corresponding to
the travel time for a reflection from an interface and amplitudes
corresponding to the amplitude of the reflected arrival. Thus

s(t) = w(t) * r(t) and S(\omega) = W(\omega) R(\omega). (16)

If the travel time differences between the arrivals corresponding
to individual reflectors are shorter than the duration
of the wavelet, interference can occur, giving a complicated
signal. Hence it would be desirable to have a delta function
source wavelet whose Fourier transform is simply 1, so that
the seismogram would equal the reflector series. Although a
physical source wavelet is not a delta function, we simulate
such a wavelet by creating an inverse filter² w^{-1}(t), which, when
convolved with the wavelet, yields a delta function:

w^{-1}(t) * w(t) = \delta(t). (17)

As we saw in Section 3.3.6, the Fourier transform of the inverse
filter is just 1/W(ω), so deconvolution can be done by dividing
the Fourier transforms

S(\omega) / W(\omega) = R(\omega). (18)

This sometimes works well, but can be problematic at frequencies
where the source wavelet spectrum W(ω) is small (causing
R(ω) to go to infinity), so a minimum amplitude threshold can
be set.
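The spectral division of Eqn 18 with a minimum amplitude threshold can be sketched as follows. The particular threshold rule used here (raising small values of |W| to a fixed fraction of the largest value while keeping their phase) is one common choice and an assumption on our part, as are the function name `deconvolve`, the wavelet, and the reflector series.

```python
import numpy as np

def deconvolve(s, w, threshold=0.01):
    """Estimate r(t) from s(t) = w(t) * r(t) by dividing spectra (Eqn 18)."""
    n = len(s)
    S = np.fft.rfft(s)
    W = np.fft.rfft(w, n)                     # wavelet zero-padded to len(s)
    floor = threshold * np.abs(W).max()       # minimum allowed |W|
    small = np.abs(W) < floor
    # Where |W| is too small, raise its amplitude to the floor and keep its phase.
    W_safe = np.where(small, floor * np.exp(1j * np.angle(W)), W)
    return np.fft.irfft(S / W_safe, n)

k = np.arange(30)
wavelet = np.exp(-k / 3.0) * np.cos(k)        # hypothetical source wavelet w(t)

reflectors = np.zeros(200)                    # hypothetical reflector series r(t)
reflectors[[40, 55, 120]] = [1.0, -0.6, 0.3]

seismogram = np.convolve(reflectors, wavelet) # s(t) = w(t) * r(t)
recovered = deconvolve(seismogram, wavelet)[:len(reflectors)]
```

In this noise-free test with a broadband wavelet, `recovered` reproduces the three spikes closely. With noisy data or a band-limited wavelet the threshold matters far more, and raising it trades resolution for stability.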

As an alternative, inverse filters can be designed in the time
domain to compress the source wavelet into a function as close
to a delta function as possible. This approach is a special case of
the general problem of finding a shaping filter that converts a
given input into a given output. We will shortly discuss another
approach, which relies not on the convolution, but on the
related cross-correlation operator.

Deconvolution is also used in other applications. A conceptually
similar one is modeling seismograms from a distant
earthquake as a sum of secondary arrivals generated when the
upcoming wave encounters interfaces below the receiver
(Fig. 6.3-7). The vertical component is assumed to represent
the direct arrival, and is used as a Green’s function that is
deconvolved from a horizontal component to find a receiver
function characterizing the structure. The receiver function
corresponds to the reflector series in this geometry. Another
application of deconvolution is to take seismograms and deconvolve
the effects of the seismometer to find the true ground
motion, or deconvolve a seismogram to try to find the source
pulse due to an earthquake (Section 4.3.3).

6.3.3 Finite length signals 

We have seen that the Fourier transform describes a signal as
the sum of harmonic signals with different frequencies. One
important limitation is that the Fourier transform requires integration
over all time. In reality, we only have data over a finite
interval of time.

To see how this affects our results, consider a window function
b(t) which selects part of the data. Its effect on the data f(t)
is represented by multiplying f(t) by b(t). We then ask how the
Fourier transform of the function, including the effect of the
window

G(\omega) = \int_{-\infty}^{\infty} b(t) f(t) \, e^{-i\omega t} \, dt, (19)

is related to the transform of the original function, F(ω).

1 The same entity is commonly termed a Green’s function in physical problems and
an impulse response in time series analysis. In seismology the terms are used essentially
interchangeably.

2 The notation w^{-1}(t) does not mean 1/w(t).

Fig. 6.3-7 Schematic diagram of the receiver function approach. The
receiver function, derived by deconvolving the vertical component from a
horizontal component, should have arrivals corresponding to the times
of seismic wave phases generated when the upcoming wave encounters
interfaces below the receiver and amplitudes reflecting the amplitudes
of these waves. The receiver function can be used to study the depths
of the interfaces and the velocity contrast there. Because a horizontal
component is used, the phases predicted involve P-to-S conversions and
their reverberations, as described by the nomenclature used to identify
phases (e.g., PpPms). (Owens et al., 1987. © Seismological Society of
America. All rights reserved.)


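This question can be answered by writing b(t) and f(t) using
their inverse transforms,

G(\omega) = \int_{-\infty}^{\infty} \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} B(\omega') e^{i\omega' t} \, d\omega' \right] \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega'') e^{i\omega'' t} \, d\omega'' \right] e^{-i\omega t} \, dt
= \frac{1}{4\pi^2} \int_{-\infty}^{\infty} B(\omega') \int_{-\infty}^{\infty} F(\omega'') \left[ \int_{-\infty}^{\infty} e^{-i(\omega - \omega' - \omega'')t} \, dt \right] d\omega'' \, d\omega', (20)

recognizing that the inner integral is the Fourier transform of a
delta function in frequency (Eqn 6.2.36),

G(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} B(\omega') \left[ \int_{-\infty}^{\infty} F(\omega'') \, \delta(\omega - \omega' - \omega'') \, d\omega'' \right] d\omega', (21)

and using the sifting property (Eqn 6.2.29) to obtain

G(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} B(\omega') F(\omega - \omega') \, d\omega' = \frac{1}{2\pi} B(\omega) * F(\omega). (22)

Thus the effect of multiplying a time series by a window function
is that the spectrum of the time series is convolved with the
spectrum of the window function. This is an example of the
fact that just as convolution in the time domain corresponds
to multiplication in the frequency domain, so multiplication in
the time domain corresponds to convolution in the frequency
domain.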

To see the effect of windowing on the spectrum, consider the
simplest window function, a “boxcar” which describes taking
only the data in a certain time interval (Fig. 6.3-8),

b(t) = 1 for -T < t < T,
     = 0 otherwise. (23)

Its Fourier transform is

B(\omega) = \int_{-T}^{T} e^{-i\omega t} \, dt = \frac{e^{i\omega T} - e^{-i\omega T}}{i\omega} = \frac{2 \sin \omega T}{\omega} = 2T \frac{\sin \omega T}{\omega T}, (24)

whose amplitude spectrum |B(ω)| has a characteristic shape
with a central lobe and smaller side lobes, and equals zero
where x = ωT = ±nπ for nonzero integer n. The width of the
central lobe is 2π/T. This |(sin x)/x| curve, sometimes called a
sinc function, is convolved with, and thus modifies, the
spectrum |F(ω)|.
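Eqn 24 can be verified by evaluating the integral numerically. In the sketch below the half-width T, the integration step, and the test frequency are assumed values, and the integral is approximated with a midpoint rule.

```python
import numpy as np

T = 10.0                                   # assumed half-width of the boxcar in s
dt = 0.001                                 # integration step in s
t = np.arange(-T + dt / 2, T, dt)          # midpoints of the integration cells

def boxcar_transform_numeric(omega):
    """Midpoint-rule estimate of the integral of exp(-i omega t) from -T to T."""
    return np.sum(np.exp(-1j * omega * t)) * dt

def boxcar_transform_exact(omega):
    """Closed form from Eqn 24: 2 sin(omega T) / omega."""
    return 2.0 * np.sin(omega * T) / omega

omega_test = 0.5                           # arbitrary test frequency in rad/s
numeric = boxcar_transform_numeric(omega_test)
exact = boxcar_transform_exact(omega_test)

# First zero of the spectrum, and the resulting width of the central lobe.
first_zero = np.pi / T
central_lobe_width = 2 * first_zero        # equals 2 pi / T
```

The numerical and closed-form values agree, the imaginary part vanishes because the window is symmetric about t = 0, and the spectrum passes through zero at ω = π/T.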

For example, if f(t) is a sine wave (Fig. 6.3-9a) whose amplitude
spectrum is described by two delta functions, convolution
with B(ω) yields the spectrum of a finite length sine wave, two
sinc functions. Thus, taking a finite length of record “smears”
the delta functions of the infinite length record’s spectrum into
broader peaks with side lobes (Fig. 6.3-9b). Taking longer
records (increasing T) yields sharper spectra (more like the
delta function), because the width of the central lobe of the sinc
function is proportional to 1/T.

This effect has an important consequence for analyzing
signals containing different frequencies, as shown in Fig. 6.3-9c
for a time series with two frequencies. For shorter record
lengths (Figs 6.3-9d and e), the spectral peaks broaden until
they start to overlap and cannot be resolved separately. Once
the width in frequency of the central lobe of the sinc function
exceeds the separation between the two spectral peaks
(Fig. 6.3-9e), they cannot be resolved. Thus the frequency
resolution, the minimum separation in frequency for which
two peaks can be resolved, is proportional to the reciprocal of
the record length.

Fig. 6.3-8 Time and frequency domain
representations of the simplest window
function, a “boxcar” that selects only the
data in a certain time interval (left). The
amplitude spectrum (right) has a central
peak and smaller side lobes.

[Figure title: Data length and frequency resolution.]

Fig. 6.3-9 Effects of finite data length on the spectrum. The spectrum
of the sine wave in (a) is “smeared” by taking a short data window (b).
For a time series with two frequencies (c), shorter record lengths cause
the spectral peaks to broaden (d) until they start to overlap and cannot be
resolved separately (e).
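The dependence of frequency resolution on record length can be demonstrated with two closely spaced cosines. In this sketch the two frequencies, the record lengths, and the simple test used to decide whether the peaks are "resolved" (a dip in the spectrum midway between them) are all assumptions made for illustration.

```python
import numpy as np

F1, F2 = 0.10, 0.11        # assumed frequencies in Hz, separated by 0.01 Hz

def amplitude_spectrum(freqs_hz, record_length, dt=0.05):
    """|G(f)| for a boxcar-windowed sum of two cosines, by direct integration."""
    t = np.arange(-record_length / 2, record_length / 2, dt)
    g = np.cos(2 * np.pi * F1 * t) + np.cos(2 * np.pi * F2 * t)
    return np.array([abs(np.sum(g * np.exp(-2j * np.pi * f * t)) * dt)
                     for f in freqs_hz])

def peaks_resolved(record_length):
    """True if the spectrum dips between the two frequencies."""
    at_f1, midway, at_f2 = amplitude_spectrum(
        [F1, (F1 + F2) / 2, F2], record_length)
    return midway < min(at_f1, at_f2)

long_record = peaks_resolved(400.0)    # 1/length = 0.0025 Hz, finer than the separation
short_record = peaks_resolved(50.0)    # 1/length = 0.02 Hz, coarser than the separation
```

With the 400 s record the two peaks are separated by a clear dip, whereas with the 50 s record the broadened central lobes merge into a single peak, so `long_record` is true and `short_record` is false.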

This relation between signals in the time and frequency 
domains demonstrates a fundamental principle. By taking a 
finite length portion of a time function, we broaden and distort 
its spectrum in a predictable way. The reverse occurs in the fre¬ 
quency domain; taking a finite portion of the spectrum distorts 
the time function, as we discussed in considering Fig. 6.3-3. 
For example, because a seismometer only responds to ground 
motion in a certain frequency range, the resulting seismogram 
is a somewhat distorted record of the ground motion. Sim¬ 
ilarly, physical processes like anelasticity (Section 3.7.8) and 
diffraction (Section 2.5.10) that remove high frequencies dis¬ 
tort the resulting waveforms. 

Thus we have an “uncertainty principle” that the product of 
the “widths” in the two domains is constant; for a time domain 
record with duration T, the resolution in the frequency domain 
is proportional to 1/T. Perfect resolution in frequency requires
infinite record length in time, and infinite bandwidth in frequency
is needed to represent a time function exactly. These
properties are general features of Fourier transform pairs, so
also apply to distance and wavenumber.³

The sinc or |sin x/x| function, which we used to represent
taking a finite portion of a time series, appears in other similar
applications. We saw that diffraction through a slit, in which
only part of a wave front is transmitted, is described by a sinc
function (Fig. 2.5-18). The sinc function also describes the
spectrum of waves radiated from a finite fault (Section 4.6.2).

In real cases, we do not have infinite lengths of data. More¬ 
over, it is not always desirable to take more data. For example, 
the signal of interest on a seismogram eventually decays into 
the noise due to attenuation, or is interfered with by a different
signal. We seek the best resolution of the spectrum of the signal 
of interest, but as the record length increases, the noise has a 
greater effect and increasingly contaminates the spectrum. We 
thus select a compromise record length and try to obtain the 
best spectrum. This issue arises in estimating seismic attenuation,
which broadens spectral peaks (Section 3.7.7) in a way
similar to that of finite record length. Longer records broaden
the peaks less, and so give better estimates of attenuation up to
the point where the effects of noise degrade the estimates.

3 The uncertainty principle also appears in quantum physics, where the position and
momentum of a particle form a Fourier transform pair. Thus, the better we know a
particle’s position, the less we know about its momentum, and vice versa.

Though we can never get around the problem of finite record 
length, it can be ameliorated by using a different window func¬ 
tion than a boxcar. A window function whose “corners” are 
less “sharp,” known as a taper , reduces the size of the side lobes 
and thus the distortion. One simple such function, a cosine 
taper, is a boxcar with smoother ends: 


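$$
W(t) = \begin{cases}
\dfrac{1}{2}\left[1 + \cos\dfrac{\pi (t + T - T_1)}{T_1}\right] & \text{for } -T < t < -T + T_1 \\[2ex]
1 & \text{for } -T + T_1 < t < T - T_1 \\[1ex]
\dfrac{1}{2}\left[1 + \cos\dfrac{\pi (t - T + T_1)}{T_1}\right] & \text{for } T - T_1 < t < T \\[2ex]
0 & \text{for other times.}
\end{cases} \qquad (25)
$$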


The parameter T₁ is the length of the tapered portion of the half-length T.
Figure 6.3-10 illustrates the effect of tapering data, by compar¬ 
ing the spectra of two windows of the same length. The side 
lobes for the tapered window are reduced. 
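
The taper is simple to evaluate directly. The sketch below assumes the reading of Eqn 25 in which the window equals 1 in the untapered center and 0 outside (−T, T); the function name `cosine_taper` and the sample values are illustrative.

```python
import math

def cosine_taper(t, T, T1):
    """Cosine taper of half-length T whose ends are tapered over a length T1."""
    if t <= -T or t >= T:
        return 0.0                      # outside the window
    if t < -T + T1:                     # rising edge
        return 0.5 * (1.0 + math.cos(math.pi * (t + T - T1) / T1))
    if t > T - T1:                      # falling edge
        return 0.5 * (1.0 + math.cos(math.pi * (t - T + T1) / T1))
    return 1.0                          # flat center

# Window with T = 1 and T1/T = 0.1, evaluated at a few representative times.
center = cosine_taper(0.0, 1.0, 0.1)
edge = cosine_taper(-1.0, 1.0, 0.1)
mid_rise = cosine_taper(-0.95, 1.0, 0.1)   # halfway up the rising edge
mid_fall = cosine_taper(0.95, 1.0, 0.1)    # halfway down the falling edge
```

The window rises smoothly from 0 to 1 over the first T₁, is flat in the middle, and falls symmetrically at the other end, which is what reduces the side lobes relative to a boxcar.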

Such a taper is often applied in the time domain to data, with
T₁/T ≈ 0.1, before taking spectra. Similarly, bandpass filters
are often tapered in the frequency domain. In the frequency
domain, a pure bandpass filter is two boxcar functions for the
positive and negative frequencies in the passband (Fig. 6.3-3).
The corresponding inverse transform thus looks like a sinc
function, and causes “ringing,” analogous to the side lobes, in
the time domain. The ringing can be reduced by tapering the 
response at the edges of the passbands. For the same reason, 
the spectrum of a theoretical (synthetic) seismogram computed 
in the frequency domain is tapered before the inverse Fourier 
transform is used to produce a synthetic seismogram in the time 
domain. 

This example brings out the general point that, in filtering 
data, we make certain choices depending on our goals and 
accept the consequences. There are no absolute criteria for 
what is best. For example, tapering a filter in the frequency 
domain reduces the ringing that can produce spurious non- 
causal arrivals, at the price of distorting the spectrum and 
waveform. We will see in Section 6.6.5 that this issue appears 
in designing digital seismometers. 


6.3.4 Correlation 

Often we want to measure how similar two signals are. A com¬ 
mon application is identifying a reflected arrival by finding the 
portion of a seismogram that most resembles a direct arrival or 
a function that we believe represents the source. To do this, 
we define the part of the signal we seek to identify as f(t), the
remaining portion of the seismogram as x(t), and form the
integral



Fig. 6.3-10 Comparison of the spectra of two windows of the same 
length. The side lobes for the tapered window are reduced, but the central 
peak is less sharp. 

$$C(L) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t)\, f(t+L)\, dt. \qquad (26)$$

C(L), the cross-correlation of x(t) and f(t), measures the similarity
between f(t) and later portions of x(t) by shifting f(t) by
different lag times , L, and evaluating the integral of the product 
as a function of L. The lag for which C(L) is maximum is the 
time shift that makes the two functions most similar. Although 
T formally goes to infinity, we set T to an appropriate value, 
because the data exist only in a finite time range. Thus the 1/T 
factor is a normalization, which is often neglected. Cross¬ 
correlation and convolution are similar operations, the major 
difference being the sign of the time shift. 
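
For sampled data the integral becomes a sum. The following is a minimal sketch of a discrete analog of Eqn 26, with the 1/T normalization omitted and samples outside the series taken as zero; the function names and the test pulse are hypothetical. With the sign convention of Eqn 26, a copy of f that arrives d samples later in x produces a peak at L = −d.

```python
def cross_correlation(x, f, lag):
    """Sum over t of x[t] * f[t + lag], treating samples outside f as zero."""
    total = 0.0
    for t in range(len(x)):
        s = t + lag
        if 0 <= s < len(f):
            total += x[t] * f[s]
    return total

def best_lag(x, f, max_lag):
    """Lag in [-max_lag, max_lag] at which the cross-correlation is largest."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda L: cross_correlation(x, f, L))

pulse = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]      # signal we seek
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]    # same pulse, 3 samples later

lag_of_peak = best_lag(delayed, pulse, 5)
peak_value = cross_correlation(delayed, pulse, lag_of_peak)
```

The magnitude of the lag at the peak gives the time difference between the two arrivals, in samples.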

Figure 6.3-11 shows an example of applying cross-correlation 
to determine the travel time difference between direct S and 
SS phases. The SS phase should be similar to S, once S is corrected
to include the effects of the additional attenuation on
the longer ray path and the π/2 phase shift due to the surface
reflection (Section 3.5.1). Direct S is selected on the seismogram,
corrected, and then cross-correlated with the rest of
the seismogram. The peak in the cross-correlation gives the lag 
that measures the arrival time difference between the two 
phases. Another application of cross-correlation is in explora¬ 
tion seismology, where an assumed Vibroseis source signal is 
cross-correlated with seismograms, giving peaks at times when 
reflections occur (Section 3.3.6). In these applications, the 
cross-correlation is being used to identify reflections, much as 
could be done by deconvolution, because the cross-correlation 
is similar to the convolution. 

A special case of the cross-correlation is the auto-correlation, 
the cross-correlation of a time series with itself,

$$R(L) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, f(t+L)\, dt. \qquad (27)$$

Fig. 6.3-12 Illustration, for a boxcar function, that the auto-correlation is
maximum at zero lag and is an even function of the lag.


Not surprisingly, the auto-correlation is maximum at zero lag
and is an even function of the lag (Figs 6.3-12 and 3.3-30).
When the cross-correlation is used to identify reflections
(Figs 6.3-11 and 3.3-31), it makes the seismogram look like the
auto-correlation of the signal near the reflection.

The auto-correlation is significant in the theory of filtering
because it is related to the amplitude spectrum. To see this,
consider a function f(t) that is zero except between −T/2 and
T/2. The auto-correlation

$$R(L) = \lim_{T\to\infty} \frac{1}{T} \int_{-T/2}^{T/2} f(t)\, f(t+L)\, dt \qquad (28)$$

can be expanded using the inverse Fourier transform and using
the time shift theorem (Section 6.2.4),

$$
\begin{aligned}
R(L) &= \lim_{T\to\infty} \frac{1}{2\pi T} \int_{-T/2}^{T/2} f(t) \left[ \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega (t+L)}\, d\omega \right] dt \\
&= \lim_{T\to\infty} \frac{1}{2\pi T} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega L} \left[ \int_{-T/2}^{T/2} f(t)\, e^{i\omega t}\, dt \right] d\omega \\
&= \lim_{T\to\infty} \frac{1}{2\pi T} \int_{-\infty}^{\infty} F(\omega)\, F(-\omega)\, e^{i\omega L}\, d\omega \\
&= \lim_{T\to\infty} \frac{1}{2\pi T} \int_{-\infty}^{\infty} |F(\omega)|^2\, e^{i\omega L}\, d\omega, \qquad (29)
\end{aligned}
$$

where the last step uses the fact that F(−ω) = F*(ω). Thus, if
we define the power spectrum, a normalized version of the
amplitude spectrum,

$$P(\omega) = \lim_{T\to\infty} \frac{1}{T} |F(\omega)|^2, \qquad (30)$$

we see that the auto-correlation is the inverse Fourier transform
of the power spectrum:

$$R(L) = \frac{1}{2\pi} \int_{-\infty}^{\infty} P(\omega)\, e^{i\omega L}\, d\omega. \qquad (31)$$

Fig. 6.3-11 Application of the cross-correlation to determine the travel
time difference between direct S and reflected SS phases on a seismogram
(a). The direct S phase (dashed line in (b)) is corrected for attenuation
(solid line in (b)), phase-shifted (c), and then cross-correlated with the rest
of the seismogram (d). The peak in the cross-correlation gives the lag that
measures the arrival time difference between the two phases. (Kuo et al.,
1987. J. Geophys. Res., 92, 6421-36, copyright by the American
Geophysical Union.)








Fig. 6.3-13 Illustration showing that a function has the same 
auto-correlation if it is reversed in time. 


As a result, the auto-correlation of a function contains informa¬ 
tion only about its amplitude spectrum, but not about its phase. 
Functions with the same amplitude spectrum but different 
phase spectra have the same auto-correlation. For example, a 
function has the same auto-correlation if it is reversed in time 
(Fig. 6.3-13). 
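
These properties are easy to confirm for a short sampled series. The sketch below uses an unnormalized discrete auto-correlation with samples outside the series taken as zero; the function name and the four-point series are arbitrary illustrative choices.

```python
def autocorrelation(f, lag):
    """Sum over t of f[t] * f[t + lag], treating samples outside f as zero."""
    return sum(f[t] * f[t + lag]
               for t in range(len(f))
               if 0 <= t + lag < len(f))

signal = [1.0, 3.0, -2.0, 0.5]
reversed_signal = signal[::-1]          # same series reversed in time

lags = list(range(-3, 4))
r_forward = [autocorrelation(signal, L) for L in lags]
r_reversed = [autocorrelation(reversed_signal, L) for L in lags]
```

Here `r_forward` and `r_reversed` agree at every lag, the values are symmetric about zero lag, and the largest value occurs at zero lag, although the two series clearly differ.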

6.4 Discrete time series and transforms 

The analysis of seismic data using Fourier transforms requires 
computers. Thus the ground motion, a continuous function of 
time, is represented by a signal consisting of the ground motion 
measured, or sampled , at discrete points in time. Early seismo¬ 
meters, which recorded on paper wrapped around a rotating 
drum, yielded continuous analog seismograms which were 
digitized to create a discretized seismogram. Modern seismo¬ 
meters typically record the ground motion as a set of amplitude 
values measured repeatedly over a constant interval, such as 
40 times per second (40 sps, “samples per second”). To work 
with digitized seismograms, the transforms and other math¬ 
ematical operations that we formulated in Section 6.3 as con¬ 
tinuous functions of time are replaced by discretized versions. 
Working with the discretized data is the subject of digital signal 
processing, whose basic ideas we discuss next. 

6.4.1 Sampling of continuous data 

The operation of sampling a signal at intervals Δt can be represented
by multiplying the signal by a series of delta functions
(Section 6.2.5) in time spaced Δt apart, called a Dirac comb or
Shah function (Fig. 6.4-1):

$$V(t; \Delta t) = \sum_{n=-\infty}^{\infty} \delta(t - n\Delta t). \qquad (1)$$





Fig. 6.4-1 Sampling a signal at intervals Δt (top) is described by
multiplying the signal by a series of delta functions that are spaced Δt
apart in time (center), called a Dirac comb. The transform of a Dirac
comb spaced at Δt in time is a comb spaced 2π/Δt in angular frequency
(bottom).

To see what this does to the spectrum of the signal being 
sampled, consider the Fourier transform of the Dirac comb, 


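$$\int_{-\infty}^{\infty} V(t; \Delta t)\, e^{-i\omega t}\, dt = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \delta(t - n\Delta t)\, e^{-i\omega t}\, dt = \sum_{n=-\infty}^{\infty} e^{-i\omega n \Delta t}, \qquad (2)$$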


which was evaluated using the sifting property of the delta 
function (Eqn 6.2.29). It turns out that although the Fourier 
transform of a single delta function is a complex exponential, 
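the transform of a Dirac comb is another Dirac comb. To see
this, note that because V(t; Δt) is periodic with period Δt, it can
be expanded in a complex Fourier series (Section 6.2.2),

$$V(t; \Delta t) = \sum_{m=-\infty}^{\infty} F_m\, e^{i\omega_m t} \quad \text{for } \omega_m = 2m\pi/\Delta t, \qquad (3)$$

whose coefficients are given by

$$F_m = \frac{1}{\Delta t} \int_{-\Delta t/2}^{\Delta t/2} V(t; \Delta t)\, e^{-i\omega_m t}\, dt. \qquad (4)$$

Because in the interval (−Δt/2, Δt/2) only one delta function,
δ(t − 0), occurs, the Fourier coefficients are

$$F_m = \frac{1}{\Delta t} \int_{-\Delta t/2}^{\Delta t/2} \delta(t)\, e^{-i\omega_m t}\, dt = \frac{1}{\Delta t}\, e^{-i\omega_m 0} = \frac{1}{\Delta t}, \qquad (5)$$

so the Fourier series for the Dirac comb is

$$V(t; \Delta t) = \frac{1}{\Delta t} \sum_{m=-\infty}^{\infty} e^{i 2 m \pi t/\Delta t}. \qquad (6)$$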

Now, consider a Dirac comb in the frequency domain, V(ω;
2π/Δt), which consists of delta functions spaced 2π/Δt apart
in angular frequency,

$$V(\omega; 2\pi/\Delta t) = \sum_{n=-\infty}^{\infty} \delta(\omega - n 2\pi/\Delta t). \qquad (7)$$

Its inverse transform can be evaluated using the sifting property
to yield

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} V(\omega; 2\pi/\Delta t)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \delta(\omega - n 2\pi/\Delta t)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} e^{i 2 n \pi t/\Delta t}, \qquad (8)$$

which is just Δt/2π times the Fourier series for V(t; Δt) (Eqn 6).
Thus the transform of a Dirac comb spaced at Δt in time is
(2π/Δt)V(ω; 2π/Δt), a comb spaced 2π/Δt in angular frequency
with an amplitude of 2π/Δt (Fig. 6.4-1).

The effects of sampling the signal x(t) at intervals Δt can be
found by writing the sampled signal x̂(t) as the product of the
signal and the Dirac comb in time,

$$\hat{x}(t) = x(t)\, V(t; \Delta t). \qquad (9)$$

Because multiplication in the time domain corresponds to convolution
in the frequency domain, the transform of the sampled
signal, X̂(ω), can be written as

$$\hat{X}(\omega) = X(\omega) * (2\pi/\Delta t)\, V(\omega; 2\pi/\Delta t). \qquad (10)$$

Hence X(ω) is convolved with the Dirac comb, causing the
spectrum of the sampled signal X̂(ω) to be periodic in angular
frequency with period 2π/Δt.

To see what this does, suppose that the signal x(t) is band
limited such that its spectrum X(ω) is zero outside the principal
angular frequency band −π/Δt < ω < π/Δt, the range between
the first delta functions on either side of the origin (Fig. 6.4-2a).
Thus, after sampling, the adjacent copies of X(ω) do not overlap
(Fig. 6.4-2b), and the spectrum of the sampled time series is
the same as that of the original time series in the principal
frequency range.

Fig. 6.4-2 Effect of sampling on the frequency amplitude spectrum.
The spectrum of the unsampled signal (a) is convolved with a Dirac comb,
making the spectrum of the sampled signal periodic in angular frequency
with period 2π/Δt. If the spectrum of the unsampled signal is zero outside
the principal angular frequency band −π/Δt < ω < π/Δt, the range between
the first delta functions on either side of the origin, the spectrum of the
sampled signal is the same as that of the original signal in this frequency
range (b). Otherwise the spectra overlap after convolution (c), a
phenomenon called aliasing that makes the sampled spectrum inaccurate.

On the other hand, if X(ω) is not limited to this range, the
spectra overlap after sampling, so that two adjacent spectra
both contribute at these frequencies (Fig. 6.4-2c). The effect
of the periodicity is that signal content at angular frequencies
|ω| > π/Δt, or frequencies |f| > 1/(2Δt), is folded into the
principal frequency range, making the spectrum there inaccurate.
This phenomenon, called aliasing, can be avoided by sampling
the signal sufficiently densely that the spectra do not overlap.
This requires that the sampling interval Δt be such that the
corresponding frequency, known as the Nyquist frequency,

$$f_N = 1/(2\Delta t) \quad \text{or} \quad \omega_N = \pi/\Delta t, \qquad (11)$$

is higher than the highest-frequency component of the signal,
so that the spectrum is correctly resolved. The shorter the
sampling interval, the higher the Nyquist frequency, the larger
the interval over which the spectrum is periodic, and thus the
higher the frequency below which the spectrum is correctly resolved.
In practice, it is desirable to sample even more densely,
perhaps four or more times more densely than the Nyquist
criterion requires. As we sample more densely, the sampled signal
becomes a better representation of the signal, and its spectrum
becomes a better representation of the true spectrum.

Fig. 6.4-3 In the time domain, aliasing can be viewed by noting that at least two samples per wavelength are needed to reconstruct a sinusoid accurately.
Any higher frequencies are aliased into lower ones. In this case, sampling a sine wave at a sampling interval of four-fifths of the period of the wave results
in an aliased signal with a period that is four times greater.


Another way to see these ideas is to note that at least two 
samples per wavelength are needed to reconstruct a sinusoid 
accurately. Any higher frequencies are aliased into lower ones 
(Fig. 6.4-3).¹ Aliasing occurs when the data are sampled, and
once this occurs, the data cannot be “unaliased.” As a result, 
seismic data are filtered with an analog anti-aliasing filter to 
remove frequencies above the Nyquist frequency before sam¬ 
pling to produce a digital seismogram. 
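
The case shown in Fig. 6.4-3 can be reproduced in a few lines. This is a minimal sketch, assuming a unit-amplitude sine wave of period 1 sampled at an interval of four-fifths of its period; the variable names are illustrative.

```python
import math

period = 1.0                 # period of the true sine wave
dt = 0.8 * period            # sampling interval: four-fifths of the period
times = [n * dt for n in range(20)]

# Samples of the true, undersampled sine wave.
samples = [math.sin(2.0 * math.pi * t / period) for t in times]

# A sine wave with four times the period (and reversed sign),
# evaluated at the same sample times.
alias = [-math.sin(2.0 * math.pi * t / (4.0 * period)) for t in times]

max_difference = max(abs(s - a) for s, a in zip(samples, alias))
nyquist_frequency = 1.0 / (2.0 * dt)     # 0.625, below the true frequency of 1
```

At the sample times the two sinusoids are indistinguishable to within rounding error, so nothing in the sampled data reveals that the true period is four times shorter.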

6.4.2 The discrete Fourier transform 

We now consider the Fourier transform of a sampled time
series. If the function f(t) is sampled at N time points that are
Δt apart, the function can be represented as

$$f(t) = f(n\Delta t) \quad \text{for } n = 0, 1, \ldots, N-1. \qquad (12)$$

To make subsequent derivations easier, we require N to be an
even number. The Fourier transform integral,

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt, \qquad (13)$$

can be written as a summation:

$$F(\omega) = \Delta t \sum_{n=0}^{N-1} f(n\Delta t)\, e^{-i\omega n \Delta t}. \qquad (14)$$

This transform is a continuous function of ω that we
approximate using its values at discrete frequency points.
Because sampling produces a spectrum that is periodic in angular
frequency with period 2π/Δt, or twice the Nyquist angular
frequency ω_N, we divide this interval into N points as

$$F(\omega) = F(k\Delta\omega) \quad \text{for } k = 0, 1, \ldots, N-1, \qquad (15)$$

with

$$\Delta\omega = 2\omega_N/N = 2\pi/(N\Delta t) = 2\pi/T, \qquad (16)$$

where T = NΔt is the total length of the data in time, sometimes
called the record length. This sampled Fourier transform of a
sampled time series is called the Discrete Fourier Transform
(DFT):

$$F(k\Delta\omega) = \Delta t \sum_{n=0}^{N-1} f(n\Delta t)\, e^{-i k \Delta\omega\, n \Delta t} = \Delta t \sum_{n=0}^{N-1} f(n\Delta t)\, e^{-i k n 2\pi/N}. \qquad (17)$$

The DFT gives values at angular frequencies

$$0,\ \Delta\omega,\ 2\Delta\omega,\ \ldots,\ (N/2)\Delta\omega,\ \ldots,\ (N-1)\Delta\omega. \qquad (18)$$

The second half of the values represent angular frequencies
greater than (N/2)Δω, which equals the Nyquist angular
frequency. These points correspond to the negative angular
frequencies, wrapped around to follow the positive angular
frequencies. For example, the first point after the Nyquist angular
frequency occurs for angular frequency

$$(N/2 + 1)\Delta\omega = (N/2)\Delta\omega + \Delta\omega = \omega_N + \Delta\omega = -\omega_N + \Delta\omega = -\left(\frac{N}{2} - 1\right)\Delta\omega, \qquad (19)$$

where we use the fact that the spectrum is periodic with period
2ω_N. Each successive point is Δω higher, so the remaining
values run from −(N/2 − 1)Δω up to −Δω. Thus, we can consider
the DFT to give values at angular frequencies

$$0,\ \Delta\omega,\ 2\Delta\omega,\ \ldots,\ \left(\frac{N}{2} - 1\right)\Delta\omega,\ \omega_N,\ -\left(\frac{N}{2} - 1\right)\Delta\omega,\ \ldots,\ -2\Delta\omega,\ -\Delta\omega. \qquad (20)$$

1 An illustration of sampling issues is that in Western films, wagon wheels some¬ 
times appear to rotate backwards, stop, or rotate only slowly forward. These effects 
result from differences between the wheels’ rotation rate and the movie cameras’ 
sampling rate, typically 24 frames per second. 


Graphically, we can think of folding the second half of the DFT 
about zero frequency to give the values of the spectrum at the 
negative frequencies (Fig. 6.4-4).

Fig. 6.4-4 Due to the periodicity of the discrete Fourier transform, the
second half of the values of the frequency amplitude spectrum, at angular
frequencies greater than the Nyquist angular frequency (N/2)Δω, represents
the negative angular frequencies.
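
The ordering of Eqn 20 can be generated directly from the index k. The following is a small sketch, assuming an even number of points N; the function name `dft_angular_frequencies` is hypothetical.

```python
import math

def dft_angular_frequencies(N, dt):
    """Angular frequency of each DFT point, with the second half wrapped to negative values."""
    d_omega = 2.0 * math.pi / (N * dt)      # spacing between frequency points
    return [k * d_omega if k <= N // 2 else (k - N) * d_omega
            for k in range(N)]

# Eight points with unit sample interval: spacing is 2*pi/8 and the Nyquist is pi.
frequencies = dft_angular_frequencies(8, 1.0)
```

For N = 8 and Δt = 1 the point at k = 4 is the Nyquist angular frequency π, the next point (k = 5) is −3Δω, and the last point (k = 7) is −Δω.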


The fact that the DFT is the sampled spectrum of a sampled
time series has two interesting consequences. The highest
angular frequency that can be resolved is the Nyquist, which
depends inversely on the sampling interval in time, because ω_N =
π/Δt. On the other hand, the resolution in frequency, given
by the spacing between successive angular frequency points,
Δω = 2π/(NΔt), depends inversely on T = NΔt, the total record
length.

For example, to resolve the singlets making up the normal 
mode multiplet ₀S₂ (Fig. 2.9-16), we would like a frequency
resolution of at least 0.0001 cycles/minute, or 1.7 × 10⁻⁶ s⁻¹.
This requires data extending for 1/(1.7 × 10⁻⁶) s, or more than
160 hours, after the earthquake. However, because the mode’s 
period is 54 minutes, a seismogram sampled every few minutes 
would be adequate and give a manageable number of data 
points. We need, however, to prevent aliasing due to surface 
and body waves that have periods of tens to hundreds of 
seconds. An easy way to do this would be to start the analysis 
a day or so after the earthquake, when the shorter-period 
waves have decayed due to attenuation. This approach uses the 
earth’s anelasticity as a natural anti-aliasing filter. By contrast, 
reflection seismology requires high temporal resolution to 
resolve closely spaced interfaces, so reflection data are sampled 
at high rates such as 250 times per second after an anti-aliasing 
filter is applied. 
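
The numbers in this example can be worked through explicitly. In the sketch below, the 5-minute sampling interval is an assumed value standing in for "every few minutes"; the other values are those quoted above.

```latex
% Record length needed for the desired frequency resolution
\Delta f = \frac{10^{-4}\ \text{cycles}}{60\ \text{s}} \approx 1.7 \times 10^{-6}\ \text{s}^{-1},
\qquad
T = \frac{1}{\Delta f} \approx 5.9 \times 10^{5}\ \text{s} \approx 163\ \text{hours}.

% Nyquist frequency for an assumed sampling interval of 5 minutes
\Delta t = 300\ \text{s}: \qquad
f_N = \frac{1}{2\Delta t} \approx 1.7 \times 10^{-3}\ \text{s}^{-1}
\quad (\text{period } 10\ \text{minutes} \ll 54\ \text{minutes}).

% Nyquist frequency for reflection data sampled 250 times per second
\Delta t = 0.004\ \text{s}: \qquad
f_N = \frac{1}{2\Delta t} = 125\ \text{Hz}.
```

The record length sets the frequency resolution, whereas the sampling interval sets the highest frequency that can be represented, so the two can be chosen separately.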


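By analogy to the DFT, we write the inverse DFT (IDFT) by
approximating the inverse Fourier transform integral

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega \qquad (21)$$

in the same way, which gives

$$
\begin{aligned}
f(n\Delta t) &= \frac{1}{2\pi} \sum_{k=0}^{N-1} F(k\Delta\omega)\, e^{i(k\Delta\omega)(n\Delta t)}\, \Delta\omega \\
&= \frac{\Delta\omega}{2\pi} \sum_{k=0}^{N-1} F(k\Delta\omega)\, e^{i k n 2\pi/N} \\
&= \frac{1}{N\Delta t} \sum_{k=0}^{N-1} F(k\Delta\omega)\, e^{i k n 2\pi/N}. \qquad (22)
\end{aligned}
$$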

An interesting feature of the IDFT comes from the fact that it
samples the spectrum at discrete frequency intervals Δω. Sampling the
time series at Δt causes the phenomenon of aliasing, because the
spectrum is periodic in angular frequency with period 2π/Δt.
By analogy, sampling the frequency spectrum at Δω makes the
time series periodic with a period of

$$\frac{2\pi}{\Delta\omega} = \frac{2\pi}{2\pi/(N\Delta t)} = N\Delta t = T, \qquad (23)$$

which is equal to the original record length.² This wraparound
phenomenon can be important, as we shall see when discussing
the use of DFTs to carry out convolutions.

6.4.3 Properties of DFTs

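For simplicity, we write the DFT and the inverse DFT implicitly
assuming a unit sampling interval, Δt = 1, and define

$$F(k) = F(k\Delta\omega) = \sum_{n=0}^{N-1} f(n)\, e^{-2\pi i k n/N} \quad \text{for } k \text{ and } n = 0, 1, \ldots, N-1 \qquad (24)$$

$$f(n) = f(n\Delta t) = \frac{1}{N} \sum_{k=0}^{N-1} F(k)\, e^{2\pi i k n/N} \quad \text{for } k \text{ and } n = 0, 1, \ldots, N-1. \qquad (25)$$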

The two equations are very similar in form and are easy to
evaluate; the forward and inverse transforms differ only
in the sign of the exponential and the 1/N normalization.
This is especially clear if we define the complex exponential as
W = e^{−2πi/N}, so the definitions of the DFT and IDFT become

$$F(k) = \sum_{n=0}^{N-1} f(n)\, W^{kn} \quad \text{and} \quad f(n) = \frac{1}{N} \sum_{k=0}^{N-1} F(k)\, W^{-kn}. \qquad (26)$$

The terms with the complex exponential are periodic in N,

$$W^{kn} = W^{(N+k)n} = W^{k(N+n)}, \qquad (27)$$

so the DFT and IDFT can be defined for all integers k, n, j as

$$f(n) = f(jN + n), \quad F(k) = F(jN + k). \qquad (28)$$

A formal statement of the relation between the negative and 
positive frequencies can also be given as 

$$f(-n) = f(N-n), \quad F(-k) = F(N-k). \qquad (29)$$

We used this relation when we explained how the second half 
of the DFT corresponds to negative frequencies (Fig. 6.4-4). 

Using these definitions, we can show that the discrete transforms
have properties that we discussed for the continuous
transforms in Section 6.2.4:³

(1) The DFT and IDFT are linear: if A(k) and B(k) are the
transforms of time series a(n) and b(n), then αA(k) + βB(k) is
the transform of αa(n) + βb(n). Thus we can use the discrete
transforms to model linear systems.

(2) The DFT of a real time series (i.e., one for which f(n) =
f*(n)) has the symmetry

$$F(-k) = F(N-k) = F^*(k). \qquad (30)$$

Thus, as with the continuous transform, the values for the
negative frequencies are the conjugates of those for the positive
frequencies.

(3) Shifting a time series in time simply changes the phase of
the DFT: if the transform of f(n) is F(k), the DFT of f(n − j) is
W^{kj}F(k). Similarly, shifting a Fourier transform in frequency
changes the phase of the IDFT: the inverse transform of
F(k − m) is W^{−mn}f(n).

2 Because of this periodicity, the record length is considered to be NΔt rather than
(N − 1)Δt.

3 As for the continuous transforms, the proofs are left for the problems.
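
Properties (2) and (3) can be verified numerically with a direct evaluation of the sums in Eqns 24 and 25. This is a minimal sketch using only the Python standard library; the six-point series and the shift j = 2 are arbitrary, and the shift is circular because the series is treated as periodic with period N.

```python
import cmath

def dft(f):
    """Direct evaluation of the DFT sum for unit sampling interval."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(F):
    """Direct evaluation of the inverse DFT sum, including the 1/N factor."""
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

f = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0]
N = len(f)
F = dft(f)

# Property (2): for a real series, F(N - k) is the conjugate of F(k).
symmetry_error = max(abs(F[(N - k) % N] - F[k].conjugate()) for k in range(N))

# Property (3): a circular shift by j samples multiplies F(k) by W**(k*j).
j = 2
shifted = [f[(n - j) % N] for n in range(N)]
W = cmath.exp(-2j * cmath.pi / N)
shift_error = max(abs(Fs - W ** (k * j) * Fk)
                  for k, (Fs, Fk) in enumerate(zip(dft(shifted), F)))

# The inverse transform recovers the original series.
roundtrip_error = max(abs(a - b) for a, b in zip(idft(F), f))
```

All three error measures are at the level of rounding error, as the properties predict.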

6.4.4 The fast Fourier transform (FFT) 

For these concepts to be useful, the transforms and inverse 
transforms must be evaluated on a computer. Moreover, it only 
makes sense to carry out filtering using Fourier transforms if 
the transform and inverse transform operations are relatively 
quick. It turns out that an elegant algorithm known as the Fast 
Fourier Transform (FFT) provides a fast way of carrying out 
the DFT and IDFT. 

The time a computer needs to carry out an algorithm
depends on how many arithmetic operations are needed. We
would expect that evaluating all N points in the DFT, each
of which is the sum of the N terms in the series, would require
approximately N² operations. The FFT algorithm, however,
requires a much smaller number of operations, approximately
N log₂ N. The difference is substantial; for N = 4096,
N² = 16,777,216, but N log₂ N = 49,152, about 340 times
fewer! As a consequence, the introduction of the FFT made
digital signal processing common in seismology and many
other disciplines.

Entire books have been written about the FFT, so we only 
briefly sketch the approach here. The underlying idea is that a 
simple method can be used to compute the transform of a series 
of points by splitting it in half. We take a series with N points, 

$$f(n) \quad \text{for } n = 0, 1, \ldots, N-1 \qquad (31)$$

and form two subseries, one with the even-numbered points and
one with the odd-numbered points:

$$
\begin{aligned}
a(n) &= (f(0), f(2), f(4), \ldots) = f(2n) \\
b(n) &= (f(1), f(3), f(5), \ldots) = f(2n+1)
\end{aligned}
\quad \text{for } n = 0, 1, \ldots, N/2 - 1. \qquad (32)
$$

The DFTs of the two subseries are

$$A(k) = \sum_{n=0}^{N/2-1} a(n)\, e^{-4\pi i k n/N} \quad \text{and} \quad B(k) = \sum_{n=0}^{N/2-1} b(n)\, e^{-4\pi i k n/N}, \qquad (33)$$

where k goes from 0 to N/2 − 1, and the factor of 4 comes from
the fact that the subseries lengths are N/2.

The DFT of the original series can be written in terms of the
DFTs of the subseries,

$$
\begin{aligned}
F(k) &= \sum_{n=0}^{N-1} f(n)\, e^{-2\pi i k n/N} \\
&= \sum_{n=0}^{N/2-1} \left[ a(n)\, e^{-2\pi i k (2n)/N} + b(n)\, e^{-2\pi i k (2n+1)/N} \right] \\
&= A(k) + e^{-2\pi i k/N} B(k) \quad \text{for } k = 0, 1, \ldots, N/2 - 1, \qquad (34)
\end{aligned}
$$

giving the first N/2 points of F(k). The second N/2 points come
from replacing k by k + N/2,

$$F(k + N/2) = A(k + N/2) + e^{-2\pi i (k + N/2)/N} B(k + N/2), \qquad (35)$$

and noting that, because the DFTs of the subseries are periodic
with a period equal to their length, N/2,

$$A(k + N/2) = A(k) \quad \text{and} \quad B(k + N/2) = B(k). \qquad (36)$$

Because the exponential can be written as

$$e^{-2\pi i (k + N/2)/N} = e^{-\pi i}\, e^{-2\pi i k/N} = -e^{-2\pi i k/N}, \qquad (37)$$

the second half of the transform can be found from the first,
using

$$F(k + N/2) = A(k) - e^{-2\pi i k/N} B(k). \qquad (38)$$

In terms of W = e^{−2πi/N}, the expressions for the two parts of the
transform (Eqns 34 and 38) have the simple form of

$$F(k) = A(k) + W^k B(k) \quad \text{and} \quad F(k + N/2) = A(k) - W^k B(k). \qquad (39)$$

This method is called doubling: finding the transform of
an N-point series from the transforms of its two N/2-point
subseries. Doubling can be applied recursively, because we
can find the transform of each N/2-point series from that of two
N/4-point series, etc. Ultimately, a series of length N = 2ⁿ can be
evaluated via n = log₂ N such stages. In the final stage, the transform
of each 2-point series is found from two 1-point series,
but the transform of a 1-point series is itself. Various methods
can be used to further speed up operations.

Thus, to obtain the FFT of a time series, we treat the data 
points as N 1-point series, use doubling to form (N/2) 2-point 
series, and so on until the final N-point transform. The same 
FFT algorithm can also be used to take the inverse transform. 


Commonly, the same computer program is used for both for¬ 
ward and inverse FFTs, except that the sign of the exponential 
must be changed and the 1/N normalization remembered (the 
last being a traditional bane of students). 
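
Doubling translates almost directly into a recursive routine. The following is a minimal sketch of the approach in Eqns 32 to 39, written for clarity rather than speed and checked against a direct evaluation of the DFT sum; the function names and the eight-point test series are illustrative, and production FFT codes are organized quite differently.

```python
import cmath

def dft(f):
    """Direct evaluation of the DFT sum, requiring about N**2 operations."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(f):
    """Recursive FFT by doubling, for a series whose length is a power of 2."""
    N = len(f)
    if N == 0 or N & (N - 1):
        raise ValueError("series length must be a power of 2")
    if N == 1:
        return [complex(f[0])]          # the transform of a 1-point series is itself
    A = fft(f[0::2])                    # transform of the even-numbered points
    B = fft(f[1::2])                    # transform of the odd-numbered points
    F = [0j] * N
    for k in range(N // 2):
        term = cmath.exp(-2j * cmath.pi * k / N) * B[k]   # W**k * B(k)
        F[k] = A[k] + term              # first half of the transform
        F[k + N // 2] = A[k] - term     # second half of the transform
    return F

series = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0, -2.0, 1.5]
max_error = max(abs(a - b) for a, b in zip(fft(series), dft(series)))
```

The two results agree to rounding error. Changing the sign of the exponent and dividing by N would give the inverse transform.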

In using the FFT to transform data as part of a filtering
operation, the factor of 1/N may be included at any step in the
process. Often, however, we use the FFT to obtain the Fourier
transform of a time series, and compare this to a result derived
in the frequency domain, such as an analytic expression for a
synthetic seismogram as a function of ω. In this case, we have to
consider the units of both the forward and the inverse DFT.
The forward DFT is an approximate way of evaluating the
Fourier transform integral (Eqn 13), in which the differential
dt is replaced by the difference Δt. Thus, the FFT results are
multiplied by Δt. Similarly, the IDFT approximates the inverse
transform integral (Eqn 21), with the differential dω replaced
by the difference Δω. Hence the results from inverting the FFT
are multiplied by Δω/(2π). The product of these two factors is
ΔωΔt/(2π) = 1/N, as expected.

This discussion assumes that the series length N is a power of
2. If this is not the case, a number of zeroes necessary to obtain
a power of 2 can be added to the end of the time series. Such
zero padding has the effect of sampling the spectrum more
densely, because the sample interval is unchanged, but the
frequency interval Δω = 2π/(NΔt) decreases. Despite the denser
sampling, the real resolution in frequency is not increased
beyond that resulting from the real (nonzero) data length.
Instead, smooth interpolation is done within the range of actual
resolution Δω_real = 2π/T_nonzero.
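
The effect of zero padding can be seen by doubling the length of a short series. This is a minimal sketch using a direct DFT sum; the four-point series is arbitrary.

```python
import cmath

def dft(f):
    """Direct evaluation of the DFT sum for unit sampling interval."""
    N = len(f)
    return [sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

data = [1.0, -2.0, 3.0, 0.5]
padded = data + [0.0] * len(data)       # zero padded to twice the length

F_data = dft(data)
F_padded = dft(padded)

# Every second point of the padded transform reproduces the original transform;
# the points in between are interpolated values of the same spectrum.
max_error = max(abs(F_padded[2 * k] - F_data[k]) for k in range(len(data)))
```

The padded transform has twice as many frequency points, but its even-numbered points coincide with the original transform, so no new information about the spectrum has been added.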

Finally, it is worth distinguishing between the DFT and the 
FFT. The DFT is the discrete approximation to the Fourier 
transform which has the periodic properties we have discussed. 
The FFT is a clever method for computing the DFT with many 
fewer operations. 

6.4.5 Digital convolution

As discussed in Section 6.3.2, the convolution is used in many 
seismological applications. This operation has some special 
features when carried out with discretized time series and their 
transforms. 

Given two discrete time series with unit sample period,
x(m) with M points x(0), x(1), ..., x(M − 1) and f(n) with
N points f(0), f(1), ..., f(N − 1), the convolution in the
time domain is written, by analogy to the integral definition, as

$$y(t) = x(t) * f(t) = \sum_{m=0}^{M-1} x(m)\, f(t-m). \qquad (40)$$

We evaluate the summation for each value of t that yields a 
nonzero value. Because f(n) is zero for n outside the range (0, 
N-l) an dx(m) is zero form outside the range (0, M- 1), there 
are N + M - 1 terms in the convolution, and y(t) is defined for 
t = 0, 1,.. ., N + M - 2. For example, if N = 3 and M = 4, the 
3 + 4 - 1 = 6 terms are 
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Because x(m) and f(n) have different lengths, the points in 
the two transforms would correspond to different angular 
frequencies. To avoid this, the two time series are extended 
with zeroes at their ends, so that their lengths equal the same 
power of 2. 

A further point to bear in mind is that the time series corres¬ 
ponding to the convolution is longer than either of the two 
series that are convolved. If the number of points in the DFT 
is less than this length, a wraparound phenomenon similar 
to aliasing occurs when we invert the transform, due to the 
periodicity resulting from the sampled transform. The two time 
series thus need to be extended to a length at least that of their 
convolution before their DFTs are taken. 


y(4) = x(2)f{2) + x(3)f(1) 


_ x(3) x(2) x(1)jx(0) 

m m t( 2) 
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y(5)=x(3)f(2) 


x(3) x(2) I x(l) | x(0) 

m I m f(2) 


Fig. 6.4-5 Schematic diagram of a time domain convolution of two 
sampled time series as a reverse, multiply, and slide operation. 


y(0)=x{0)f(0) 
y(l)=x(0)f(l) + x(l)f(0) 
y(2)=x(0)f(2) + x(l)f(l)+x(2)f(0) 
y(3)=x(l)f(2)+x(2)f(l)+x(3)f(0) 
y(4)=x(2)f(2)+x(3)f(l) 

y(5)=x(3)f(2). ( 41 ) 

We can think of this operation as reversing the order of x{m) 
and sliding it past f{n ), while conducting all nonzero multiplica¬ 
tions (Fig. 6.4-5). 

These formulations show that the convolution has more 
terms than either of the time series being convolved. This has 
some interesting consequences if we do the convolution in the 
frequency domain. Because the data are sampled at discrete 
intervals, convolution in the frequency domain requires taking 
two discrete Fourier transforms, multiplying them, and then 
taking the inverse discrete Fourier transform. If Y(k ), X{k), and 
F(k ) are the DFTs of y{t), x{m), and f(k), then 

Y(k) = X(k)F(k) (42) 

gives the complex spectrum at each angular frequency. This 
brings out an important point; all the DFTs must be defined 
at the same frequencies. For a time series of length N with
unit sample period (Δt = 1), the angular frequencies in the DFT
are 

(43) 


Seismology uses data to estimate quantities that describe the 
earth and seismic sources. Ideally these estimates are both 
accurate and precise. Accuracy measures the deviation of the 
estimate from its true value, whereas precision measures 
the repeatability of individual estimates. Hence the accuracy 
depends on systematic errors that bias groups of estimates, 
whereas the precision depends on random errors that affect 
individual estimates. Estimates can be precise but inaccurate, 
or accurate but imprecise. For example, an estimate of an 
earthquake’s location depends on the quality of the travel time 
data used and the accuracy of the velocity model. High-quality 
travel time data, together with an incorrect velocity model, can 
yield a location that is precise in that the data are well fit and 
so imply small uncertainty, but inaccurate in that the resulting 
location is not where the earthquake occurred. In such a case 
the true uncertainty exceeds the formal uncertainty inferred 
from how well the model fits the data. Conversely, an accurate 
velocity model and poor travel time data can give a location 
that is accurate in that it is close to where the earthquake 
occurred, but imprecise in that the location has a large
uncertainty and there are large misfits to the data.

Approaches to improving the accuracy and precision of 
estimates are often couched in terms of measuring a quantity 
like the length of a table. Accuracy is improved by using
different measuring tools, ideally calibrated against each other.
Precision is improved by making multiple measurements, 
ideally by different people. We follow such approaches for the 
earth when possible, but face additional complexities. For 
example, an earthquake is a nonrepeatable experiment, so we 
cannot make additional measurements. We can use different 
techniques, but still face difficulties. A case in point is that 
estimates of an earthquake’s depth from travel times and 
waveform modeling are only partially independent. Both can 
be biased similarly by incorrect assumptions about the
near-source velocity, but the travel times are independent of the
assumed source mechanism, and the waveform modeling 
(which depends on relative arrival times) would not be biased 
by an error in the absolute timing of individual seismograms. 


$$\omega_k = k\,\Delta\omega = 2\pi k/N \quad \text{for } k = 0, 1, \ldots, N - 1.$$

A further complexity is that different methods can measure
related but not identical quantities. For this reason the
earthquake depth ranges inferred from travel times, waveform
modeling, aftershock locations, and geodesy differ somewhat.

Most discussions of these issues focus on random errors
because they are easy to estimate from the scatter of
measurements. However, it is worth bearing in mind that systematic
errors not included in these error estimates can be more
significant, as discussed in Section 1.1.2. Systematic errors can
come about in surprising ways and have subtle and crucial
effects. For example, we have noted that velocity
heterogeneities can perturb ray paths and thus bias earthquake focal
mechanisms (Section 3.7.3); attenuation variations can bias
estimates of the yields of nuclear explosions (Section 1.2.8);
errors in the paleomagnetic time scale can bias estimates of
plate motions (Section 5.2.2); and effects including an
undetected earthquake can change estimates of earthquake
recurrence from paleoseismology (Section 1.2.5). Systematic biases
are difficult to detect, but sometimes are identified from
discrepancies between different approaches. For example, the
discrepancy between earth models derived from body waves
and those from normal modes suggests physical dispersion due
to anelasticity (Section 3.7.8), and the discrepancy between
oceanic Love and Rayleigh wave velocities points toward
anisotropy (Section 3.6.5). Hence, when data are discordant,
as in the differences in earthquake frequency-magnitude
relations derived from seismological and paleoseismic data
(Section 4.7.1), systematic bias is one possible cause.

In this section, we develop some general ideas about errors 
and consider some examples. Our focus is one of the most 
useful methods for improving estimates from seismological 
data: stacking, or taking multiple measurements and averaging
them. We do this either by averaging measurements such as 
travel times from different seismograms, or by adding many 
seismograms and then estimating parameters. This process has 
two effects. First, it improves precision by reducing the effects 
of random noise in the data. Second, if the data are averaged 
in specific ways, the precision, and perhaps accuracy, can be 
improved by suppressing some features of the data and thus 
enhancing desired features. 


6.5.1 Random errors

We seek to estimate a quantity x from multiple measurements,
each of which gives a value x_i due to noise and the limitations
of the measurements. With enough measurements, a pattern
generally emerges in which the values x_i are distributed about a
value x'. If we neglect systematic errors of measurement, we
can estimate the value of x from the measured values x_i and say
something about how this estimate is related to the unknown
true value of x.

For this purpose we view the measured values x_i as random
samples from a parent distribution described by the probability
density function p(x) that gives the probability of observing
a certain value. For example, in Section 4.7.3 we treated the
occurrence of earthquakes as samples from a parent distribution
of recurrence times. That example illustrated that in most
applications it is not clear what the most suitable parent
distribution is. It is common to assume that the parent
distribution is a Gaussian distribution, also called the “normal
distribution,” because it often describes the frequencies at
which very different phenomena occur. A famous result called the
central limit theorem shows that this is because a sum of random
numbers approaches a Gaussian distribution even if the random
numbers are derived from other probability distributions.

Fig. 6.5-1 Probability density function for a Gaussian distribution with
mean μ and standard deviation σ. Ranges within one and two standard
deviations of the mean are shown by vertical lines.

For a Gaussian distribution, the probability that the i-th
measurement would yield a value in the interval x_i ± dx, in the
limit as dx → 0, is

$$p(x_i) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x_i - \mu}{\sigma}\right)^2\right]. \qquad (1)$$

The distribution is thus characterized by two parameters: the
mean, μ, and the standard deviation, σ. The most probable
measurement is the mean value, and values on either side of it
are less likely the further from the mean they are. The
distribution is often written as a function of the normalized
variable z = (x − μ)/σ,

$$p(z) = \frac{1}{\sqrt{2\pi}} \exp\left[-z^2/2\right]. \qquad (2)$$


Figure 6.5-1 shows the familiar “bell curve” that results. 

A common application is to estimate how likely a measurement
is to be within a range z from the mean. To do this, we
integrate the probability density function to find the cumulative
probability

$$A(z) = \int_{-z}^{z} p(y)\,dy = \frac{1}{\sqrt{2\pi}} \int_{-z}^{z} \exp\left[-y^2/2\right] dy. \qquad (3)$$

For z = 1, we get A(z) = 0.68, indicating that there is a 68%
probability that a measurement will be within one standard
deviation of the mean. Similarly, A(2) = 0.95 and A(3) = 0.997,
indicating a 95% probability that a measurement will be
within two standard deviations of the mean, and a greater
than 99% probability that it will be within three standard
deviations. We used such ideas in estimating earthquake
probabilities (Section 4.7.3).
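
These values can be checked numerically. The integral in Eqn 3 equals the error function evaluated at z/√2, so a short sketch using the standard library function `math.erf` reproduces them; the function name `prob_within` is ours, introduced only for illustration.

```python
import math

def prob_within(z):
    """Probability that a Gaussian sample lies within z standard
    deviations of the mean, A(z) = erf(z / sqrt(2))."""
    return math.erf(z / math.sqrt(2.0))

for z in (1, 2, 3):
    print(f"A({z}) = {prob_within(z):.4f}")
```

To four decimal places this gives 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% figures quoted above.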

We expect that if we made an infinite number of measurements
(samples) without any systematic biases, a histogram
of the measurements would look like the parent distribution.
The mean of the observed values will be the mean of the
distribution

$$\mu = \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad (4)$$

and the spread of the measurements is the variance (standard
deviation squared) of the distribution,

$$\sigma^2 = \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2. \qquad (5)$$

Thus, if the assumptions we have made are valid, the mean of
a large number of measurements, μ, would be the value that
we seek.

The difficulty in reality is that only a limited number of
measurements are available to estimate μ. As a result, the actual
mean μ' is not necessarily equal to μ. We thus ask what method
of deriving μ' from the measurements gives the maximum
likelihood that μ' is actually the mean of the parent distribution.

To find this, we assume that the parent distribution had
mean μ' and standard deviation σ, so the probability that the
i-th measurement would yield a value in the interval x_i ± dx in
the limit as dx → 0 is

$$p_i(\mu') = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x_i - \mu'}{\sigma}\right)^2\right]. \qquad (6)$$

For N observations, the probability of observing a particular
set of values x_i is the product of the probabilities that each
individual measurement would have that particular value,

$$p(\mu') = \prod_{i=1}^{N} p_i(\mu') = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{N} \exp\left[-\frac{1}{2}\sum_{i=1}^{N}\left(\frac{x_i - \mu'}{\sigma}\right)^2\right]. \qquad (7)$$

The most probable value of μ' is the one that maximizes p(μ'),
the probability of obtaining the set of measurements actually
found. To find this value, we set the derivative of the argument
of the exponential equal to zero,

$$0 = \frac{d}{d\mu'}\left[-\frac{1}{2}\sum_{i=1}^{N}\left(\frac{x_i - \mu'}{\sigma}\right)^2\right] = -\frac{1}{2}\sum_{i=1}^{N}\frac{d}{d\mu'}\left(\frac{x_i - \mu'}{\sigma}\right)^2, \qquad (8)$$

which occurs when

$$\sum_{i=1}^{N} (x_i - \mu') = 0, \qquad (9)$$

so

$$\mu' = \frac{1}{N}\sum_{i=1}^{N} x_i. \qquad (10)$$

This is not surprising: the average value of x_i is the best
estimate of the mean. An interesting question is what is the
standard deviation σ_μ of this estimate of μ? Specifically, how
does the uncertainty associated with this estimate compare to
the uncertainty of each individual measurement?

To answer this, we use the propagation of errors, a general
method for finding the relation between the uncertainty in a
function and the uncertainty in the variables that it depends on.
If z is a function of multiple variables, then

$$z = f(u, v, \ldots), \qquad (11)$$

and we have N measurements of (u, v, ...). The mean value of
the function is its value for the mean of the arguments,

$$\bar{z} = f(\bar{u}, \bar{v}, \ldots), \qquad (12)$$

and its variance is

$$\sigma_z^2 = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} (z_i - \bar{z})^2. \qquad (13)$$

If we expand z in a Taylor series about its mean value,

$$z_i - \bar{z} = (u_i - \bar{u})\frac{\partial z}{\partial u} + (v_i - \bar{v})\frac{\partial z}{\partial v} + \ldots, \qquad (14)$$

the variance becomes

$$
\begin{aligned}
\sigma_z^2 &= \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N}\left[(u_i - \bar{u})\frac{\partial z}{\partial u} + (v_i - \bar{v})\frac{\partial z}{\partial v} + \ldots\right]^2 \\
&= \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N}\left[(u_i - \bar{u})^2\left(\frac{\partial z}{\partial u}\right)^2 + (v_i - \bar{v})^2\left(\frac{\partial z}{\partial v}\right)^2 + 2(u_i - \bar{u})\frac{\partial z}{\partial u}(v_i - \bar{v})\frac{\partial z}{\partial v} + \ldots\right].
\end{aligned}
\qquad (15)
$$

To simplify this expression, we use the variances of each
variable about its mean

$$\sigma_u^2 = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N}(u_i - \bar{u})^2 \quad \text{and} \quad \sigma_v^2 = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N}(v_i - \bar{v})^2 \qquad (16)$$

and the covariances that describe how fluctuations between
variables are correlated:

$$\sigma_{uv}^2 = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N}(u_i - \bar{u})(v_i - \bar{v}). \qquad (17)$$

Substituting Eqns 16 and 17 into Eqn 15 gives 


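$$\sigma_z^2 = \sigma_u^2\left(\frac{\partial z}{\partial u}\right)^2 + \sigma_v^2\left(\frac{\partial z}{\partial v}\right)^2 + \ldots + 2\sigma_{uv}^2\,\frac{\partial z}{\partial u}\frac{\partial z}{\partial v}. \qquad (18)$$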


This relation, called the propagation of errors equation,
illustrates that the extent to which the uncertainty in each variable
contributes to the uncertainty in a function depends on the
partial derivative of the function with respect to that variable.
We often assume that the variations in the different variables
are uncorrelated (which is not always the case), so we set the
covariances equal to zero, and simplify the variance of z to

$$\sigma_z^2 = \sigma_u^2\left(\frac{\partial z}{\partial u}\right)^2 + \sigma_v^2\left(\frac{\partial z}{\partial v}\right)^2 + \ldots \qquad (19)$$

This result is a general one that we have already mentioned 
in the context of estimating the uncertainty of geodetic rates 
(Eqn 4.5.8) and earthquake source parameters (Eqn 4.6.23). 
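
As an illustration, consider the product z = uv of two uncorrelated quantities, for which the partial derivatives are ∂z/∂u = v and ∂z/∂v = u. The sketch below compares the uncertainty predicted by the uncorrelated form of the propagation of errors equation with the scatter of a simple Monte Carlo simulation. The numerical values, the function name `propagated_sigma`, and the choice of a product as the test function are all hypothetical, chosen only to make the comparison concrete.

```python
import math
import random

def propagated_sigma(u, v, sigma_u, sigma_v):
    """Uncertainty of z = u * v from the uncorrelated propagation
    of errors equation, evaluated at the mean values of u and v."""
    dz_du = v   # partial derivative of z with respect to u
    dz_dv = u   # partial derivative of z with respect to v
    return math.sqrt(sigma_u**2 * dz_du**2 + sigma_v**2 * dz_dv**2)

# Hypothetical means and standard deviations
u_mean, v_mean, sigma_u, sigma_v = 10.0, 5.0, 0.1, 0.2

random.seed(1)
z = [random.gauss(u_mean, sigma_u) * random.gauss(v_mean, sigma_v)
     for _ in range(100000)]
z_mean = sum(z) / len(z)
sigma_mc = math.sqrt(sum((zi - z_mean)**2 for zi in z) / len(z))
sigma_lin = propagated_sigma(u_mean, v_mean, sigma_u, sigma_v)

print(f"propagation of errors: {sigma_lin:.3f}")
print(f"Monte Carlo scatter:   {sigma_mc:.3f}")
```

Here the predicted variance is 25(0.01) + 100(0.04) = 4.25. The two estimates agree closely because the uncertainties are small compared with the means, so the first-order Taylor expansion is a good approximation.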

In the specific application here, we consider the mean to be a
function of the observations,

$$z = \mu' = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad (20)$$

so the error propagation equation can be used with (u, v, ...)
= x_i. Assuming that the variables are independent, so their
errors are uncorrelated, we get

$$\sigma_\mu^2 = \sum_{i=1}^{N} \sigma_{x_i}^2\left[\frac{\partial}{\partial x_i}\left(\frac{1}{N}\sum_{j=1}^{N} x_j\right)\right]^2 = \frac{1}{N^2}\sum_{i=1}^{N}\sigma_{x_i}^2. \qquad (21)$$

If all the observations have equal uncertainties (σ_{x_i}² = σ²), then

$$\sigma_\mu^2 = \sigma^2/N. \qquad (22)$$

Thus the variance of the mean is 1/N times the variance of
the individual measurements. Hence making N measurements
reduces the standard deviation of the mean by a factor of √N.
This is the basic idea behind stacking; averaging multiple
measurements of some quantity yields an estimate that has a
smaller uncertainty than the individual measurements.

Fig. 6.5-2 Results of drawing N samples from a Gaussian parent
distribution with mean zero and a unit standard deviation. For small
numbers of samples, the observed distribution can look quite different
from the parent distribution, and the sample mean μ' differs from that of
the parent distribution. As the number of samples increases, the observed
distribution looks increasingly like the parent distribution.
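
The 1/√N behavior is easy to check with a small simulation. The sketch below, which assumes a Gaussian parent distribution with zero mean and unit standard deviation, repeatedly draws N samples, averages them, and measures the scatter of the resulting means. The function name `scatter_of_means`, the number of trials, and the seed are arbitrary choices made for illustration.

```python
import math
import random

def scatter_of_means(n_samples, n_trials=2000, sigma=1.0, seed=0):
    """Standard deviation of the mean of n_samples Gaussian samples,
    estimated from n_trials independent repetitions."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        samples = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
        means.append(sum(samples) / n_samples)
    grand_mean = sum(means) / n_trials
    return math.sqrt(sum((m - grand_mean)**2 for m in means) / n_trials)

results = {n: scatter_of_means(n) for n in (1, 4, 16, 64)}
for n, s in results.items():
    print(f"N = {n:2d}: observed {s:.3f}, predicted {1 / math.sqrt(n):.3f}")
```

Each fourfold increase in the number of samples should roughly halve the scatter of the mean, to within the statistical fluctuations of the simulation.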

Figure 6.5-2 illustrates this idea. We assume that
measurements of some quantity are described by a Gaussian parent
distribution with a mean of zero, and we try to estimate this
quantity with different numbers of samples. As the number of
samples increases, the distribution of samples looks
increasingly like the parent distribution, and the sample mean
approaches the mean of the parent distribution. However, for a
small number of samples, the observed distribution can look
quite different from the parent distribution. This issue arises in
studying earthquake recurrence, where the few samples
available make it difficult to assess whether apparent differences in
earthquake history (Section 4.7.1) are significant and what
parent distributions and parameters should be used to estimate
earthquake probabilities (Section 4.7.3).

This simple Gaussian model is widely used in analyzing data.
We assume that each measurement includes the quantity of
interest and some noise, defined as the portion of the signal that
is not of interest. The noise thus reflects both true errors
of measurement and processes not under consideration, all of
which are assumed to be uncorrelated between measurements.
To the extent that these assumptions are valid, stacking data
will improve the signal. The random, uncorrelated noise idea
often seems to be a good approximation. However, if noise is
correlated between measurements, as can occur if the
measurement equipment is biased or an “error” source is otherwise
common to the measurements, the desired noise reduction 
will be less. For instance, the structure under a seismometer is 
studied by means of receiver functions that are derived using 
the radial and vertical components (Fig. 6.3-7), assuming that 
the noise on each is uncorrelated. However, noise due to 
microseismic activity (Section 6.6.3) will be correlated between 
components and hence can yield spurious layering. 
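
The effect of correlated noise on a stack can be sketched with a simple model, which is our assumption rather than one taken from the text: each measurement contains an independent noise term with standard deviation σ_i plus a common term with standard deviation σ_c shared by all measurements in the stack. Including the covariance terms in the propagation of errors equation then gives a variance of the stacked value of σ_c² + σ_i²/N, so the common part is not reduced by averaging. The function names and numerical values below are hypothetical.

```python
import math
import random

def stacked_scatter(n_traces, sigma_common, sigma_indep,
                    n_trials=2000, seed=0):
    """Scatter of the stack when every trace shares one common noise term."""
    rng = random.Random(seed)
    stacks = []
    for _ in range(n_trials):
        common = rng.gauss(0.0, sigma_common)   # same for all traces
        traces = [common + rng.gauss(0.0, sigma_indep)
                  for _ in range(n_traces)]
        stacks.append(sum(traces) / n_traces)
    mean = sum(stacks) / n_trials
    return math.sqrt(sum((s - mean)**2 for s in stacks) / n_trials)

def predicted_scatter(n_traces, sigma_common, sigma_indep):
    """Square root of sigma_c^2 + sigma_i^2 / N."""
    return math.sqrt(sigma_common**2 + sigma_indep**2 / n_traces)

observed = stacked_scatter(100, 0.5, 1.0)
predicted = predicted_scatter(100, 0.5, 1.0)
# Same total noise per trace, but with no common component
uncorrelated = predicted_scatter(100, 0.0, math.sqrt(1.25))

print(f"observed {observed:.3f}, predicted {predicted:.3f}, "
      f"if uncorrelated {uncorrelated:.3f}")
```

With these values, stacking 100 traces leaves a scatter of about 0.51 rather than the 0.11 expected if all the noise were uncorrelated.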

6.5.2 Stacking examples 

A simple stacking approach is to add seismograms at nearby 
stations, assuming that they contain a common signal of interest 
plus “noise” that differs between stations. The noise includes 
differences in the response of the seismometers and differences 
in the seismograms generated by the interaction between the 
upcoming waves and the crustal structure under each
seismometer. If the seismometers and crustal structure are similar
enough, stacking seismograms should reduce the noise and 
yield a better representation of the signal of interest than the 
individual seismograms. 

An extension of this idea is used for seismograms at different 
places or times. If we know theoretically how the signal of 
interest varies as a function of position or time, we can correct 
the data to a common position or time and stack them. For 
example, in CMP stacking of reflection seismic data, traces 
with a common midpoint are shifted by a time corresponding 
to the travel time curve of a reflection and then stacked (Section 
3.3.4). The reflected arrivals are in phase and thus enhanced, 
whereas other arrivals with different travel time curves are out 
of phase and thus suppressed. Although the undesired arrivals 
are not random noise, they are reduced relative to the reflected 
arrivals. Random noise in the data is also reduced. 
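
The shift-and-stack idea can be sketched with synthetic spike traces. The example assumes a hyperbolic reflection travel time curve, t(x) = (t0² + x²/v²)^(1/2), for a flat reflector beneath a constant-velocity layer, plus a direct wave with a linear travel time curve. All velocities, offsets, and function names are hypothetical, and a real implementation would interpolate between samples rather than round to the nearest one.

```python
import math

DT = 0.004          # sample interval (s)
NT = 500            # samples per trace
T0 = 1.0            # zero-offset reflection time (s)
V = 2000.0          # stacking velocity (m/s)
V_DIRECT = 1500.0   # direct-wave velocity (m/s)
OFFSETS = [200.0 * i for i in range(8)]   # source-receiver offsets (m)

def make_trace(x):
    """Synthetic trace: unit spikes for the reflection and direct wave."""
    trace = [0.0] * NT
    trace[round(math.sqrt(T0**2 + (x / V)**2) / DT)] += 1.0
    trace[round(x / V_DIRECT / DT)] += 1.0
    return trace

def nmo_correct(trace, x, v):
    """Map each zero-offset time to the sample on the hyperbola at offset x."""
    out = [0.0] * NT
    for j in range(NT):
        k = round(math.sqrt((j * DT)**2 + (x / v)**2) / DT)
        if k < NT:
            out[j] = trace[k]
    return out

def stack(traces):
    """Average the traces sample by sample."""
    return [sum(samples) / len(traces) for samples in zip(*traces)]

traces = [make_trace(x) for x in OFFSETS]
raw = stack(traces)
stacked = stack([nmo_correct(tr, x, V) for tr, x in zip(traces, OFFSETS)])

j0 = round(T0 / DT)
print("raw stack at t0:      ", raw[j0])
print("corrected stack at t0:", stacked[j0])
```

Without the correction the reflection is spread over many samples. After it, the reflection adds in phase at the zero-offset time, whereas the direct wave, which follows a different travel time curve, remains spread out.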

This approach is also useful in observing deeper earth
structures, such as mantle discontinuities (Section 3.5.3).
Figure 6.5-3 shows an example of stacking large numbers of
long-period transverse-component seismograms to enhance
precursors to the SS arrivals. The precursors, S410S, S520S, and
S660S, are underside reflections from the discontinuities at 410,
520, and 660 km depths. However, these phases are weak and
are not easily observed above the noise on individual
seismograms. Stacking many records enhances these arrivals,
allowing the depths of the discontinuities to be studied. Moreover,
after removal of the theoretical signals of S410S and S660S
(Fig. 6.5-3, middle), the stacked record shows the S520S arrival
(Fig. 6.5-3, bottom), which is weak due to the gradual velocity
change at the 520 km discontinuity, and so rarely observed
otherwise.

Fig. 6.5-3 Stacking long-period seismograms to identify the depth of
mantle discontinuities by enhancing precursors to SS. The initial stack
(top) shows the S410S and S660S underside reflections off the 410 km and
660 km discontinuities, magnified by a factor of 10. A theoretical signal
generated from the SS wave (center) is subtracted from the observed stack
to reveal the reflection from the 520 km discontinuity (bottom). (Shearer,
1996. J. Geophys. Res., 101, 3053-66, copyright by the American
Geophysical Union.)

Fig. 6.5-4 Slant stack of seismograms at 279 stations for a deep
(476 km) earthquake, the April 3, 1985, Bonin earthquake. The
bull’s-eyes are concentrations of seismic energy for particular
arrivals. (Vidale and Benz, 1992. Reproduced with permission from
Nature.)

Mantle structures can also be observed with slant stacks
(Section 3.3.5). The seismograms are stacked as functions of
both time and slowness, so instead of getting a single
seismogram, as in Fig. 6.5-3, we get a plot of seismic energy as a
function of time and slowness. As shown in Fig. 6.5-4, arrivals
occur as high-amplitude bull’s-eyes. The P and pP arrivals have
a slightly different slowness due to the small (about 1°)
difference in incidence angles. The large arrivals create smeared
features that are artifacts of the slant stacking. 
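
A minimal slant stack can be sketched as follows. Each trace is shifted by the delay predicted for a trial slowness p at its distance x, and the shifted traces are averaged, giving an image as a function of slowness and time. The synthetic data (two arrivals with integer slownesses, measured in samples per unit distance) and all names are hypothetical, chosen so that the shifts fall exactly on samples.

```python
NT = 80                          # samples per trace
DISTANCES = [0, 1, 2, 3, 4, 5]   # station distances from a reference point
ARRIVALS = [(20, 2), (40, 5)]    # (time at reference, slowness) per arrival
SLOWNESSES = range(0, 8)         # trial slownesses

def make_traces():
    """Synthetic traces with a unit spike for each arrival."""
    traces = []
    for x in DISTANCES:
        trace = [0.0] * NT
        for t0, p in ARRIVALS:
            trace[t0 + p * x] += 1.0
        traces.append(trace)
    return traces

def slant_stack(traces, distances, slownesses):
    """Average the traces along lines t + p * x for each trial slowness p."""
    image = {}
    for p in slownesses:
        row = [0.0] * NT
        for t in range(NT):
            total = 0.0
            for trace, x in zip(traces, distances):
                k = t + p * x
                if k < NT:
                    total += trace[k]
            row[t] = total / len(traces)
        image[p] = row
    return image

image = slant_stack(make_traces(), DISTANCES, SLOWNESSES)
peaks = sorted((p, t) for p, row in image.items()
               for t, v in enumerate(row) if v == 1.0)
print(peaks)
```

The full-amplitude peaks occur only at the slowness and reference time of each arrival. Other slowness-time cells collect energy from only some of the traces, which is the origin of smearing of the kind described above.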

Stacking is also used to enhance specific normal modes of the
earth. The amplitudes of normal modes vary between stations,
because they depend on spherical harmonics that are
functions of latitude and longitude, which differ between individual
modes (Section 2.9.3). Although simply stacking seismograms
from different sites does not make spectral peaks stand out
better, correcting for the theoretical variation in amplitude and
phase for a given mode and then stacking enhances the mode of
interest and suppresses others (Fig. 6.5-5).

Fig. 6.5-5 Stacking long-period seismograms to enhance specific normal
modes of the earth. Although a given mode multiplet is not enhanced by
simply stacking seismograms from different sites (top), stacking using its
predicted variation between sites enhances the multiplet and suppresses
others (lower panels). (Mendiguren, 1973. Science, 179, 179-80,
copyright 1973 American Association for the Advancement of Science.)

Stacking can be applied to very large volumes of data.
Figure 6.5-6 shows record sections generated with thousands
of digitally recorded seismograms from different earthquakes
and seismometers. The seismograms were rotated into vertical,
radial, and transverse components, grouped by source-receiver
distance, and then those within half-degree intervals were
normalized to a common amplitude and stacked. The strong
arrivals in the stacked record sections correspond to the major
phases shown in the travel time curves. It is interesting to
compare this analysis of global seismic data spanning large distance
ranges with reflection seismic data analysis (Section 3.3.4).
For reflection data, CMP stacking involves forming common
midpoint gathers and stacking them over all source-receiver
distances (offsets) (Fig. 3.3-18), to produce synthetic
zero-offset traces on which reflected arrivals are enhanced. These
traces are then shown together to produce a common
midpoint section, a function of midpoint and time. By contrast,
the global data are gathered by common offset, stacked for
that offset, and then displayed as a function of offset and
time. This operation only reduces noise, rather than
enhancing specific arrivals, and so shows various arrivals (direct
waves, reflections, surface waves, etc.). Another example was
shown in Fig. 2.7-4, where many long-period seismograms
were stacked to demonstrate the group and phase velocities of
surface waves.

Fig. 6.5-6 Stacking of global seismograms to produce record sections.
The three stacks, each for a different component, show distinct
arrivals that can be compared to those predicted by the travel
time curve for an earth model. (Astiz et al., 1996. © Seismological
Society of America. All rights reserved.)


In these or other stacking operations, one possible source of 
systematic error is incorrect transformation of the data between 
different times or positions. Interestingly, in the very different 
cases just discussed, a common difficulty is lateral variation in 
structure. In the reflection example, structures may dip rather 
than be flat-lying, causing traces with common midpoints not to 
sample the same point on a reflector (Fig. 3.3-19). In the global 
travel time analysis, seismograms for the same source-receiver 
distance differ when the structure between the source and the 
receiver differs. An analogous effect occurs for normal modes 
due to deviations of the structure from spherical symmetry. 
Nevertheless, because in most cases structure varies primarily 
with depth, these stacking operations generally work well.

6.6 Seismometers and seismological networks

6.6.1 Introduction 

Given what we have discussed about signal processing, we now 
introduce some ideas about seismometry, the design and
development of seismic instrumentation. Although we informally
call such systems seismometers, the seismometer is actually the 
sensor recording ground motion, and thus a key component of 
the entire seismograph system, which also contains amplifying, 
timing, and recording components. The product, a record of 
ground motion as a function of time, is a seismogram. 

Following linear system theory, we note that a seismogram is 
not an exact representation of the ground motion. Seismograms 
depend upon the seismometer and the rest of the seismograph 
system, because the sensitivities of seismometers vary with the 
frequency of the motion recorded. Moreover, seismometers 
record ground motion as displacement, velocity, acceleration, 
or various combinations of these.¹

Once seismic data are recorded, distributing them is crucial,
because the data are of no use until they are available for study.
Hence seismology has long been a leader among the sciences in
developing public data distribution. This tradition began a
century ago out of necessity. Unlike a geological field
observation or a geochemical experiment, observations at many sites
are needed to locate and study earthquakes, with the more data
the better. Soon after seismometers became sensitive enough to
teleseismically record earthquakes, arrival times were shared.
The first major attempt to gather and publish seismically
recorded arrival times was the bulletin of the Bureau Central
International de Seismologie (BCIS), which began in 1904. The
International Seismological Summary (ISS) began publication
in 1913,² and eventually became the Bulletin of the
International Seismological Centre (ISC), now an authoritative
source of earthquake locations. Not only arrival times but also
polarities and amplitudes were disseminated, enabling the
study of magnitudes and focal mechanisms.

This sharing of data has been crucial to seismology’s growth.
In the modern era, the World Wide Standardized Seismograph
Network (WWSSN), which started in 1962, was the first means
of globally sharing full seismic waveform data. Today,
high-quality digital global seismic data are available through the
Federation of Digital Broad-Band Seismographic Networks
(FDSN), of which the stations of the US-sponsored
Incorporated Research Institutions for Seismology (IRIS) are a part.
Data and results such as earthquake locations are also provided
by national and regional data centers. Seismologists anywhere
in the world need only a computer and access to the Internet
to freely and conveniently obtain terabytes³ of digital seismic
data, software to look at it, and a great deal of other
earthquake information. As much as any development in theory or
seismometry, this free access to data and software is
responsible for the remarkable growth of the field within the past
century. Not only can scientists work more efficiently, but this
openness has encouraged the sharing of data and models, and
allowed comparison and testing of results.

¹ This is analogous to the way animals see differently; the electromagnetic radiation
is the same, but human eyes respond slightly differently than those of bears (which are
very nearsighted), and entirely differently from the hexagonally tiled eyes of flies.

² Its original name was the Monthly Bulletin of the Seismological Committee of the
British Association for the Advancement of Science.

³ One terabyte (Tbyte) equals 10^12 bytes.

Fig. 6.6-1 Pendulum seismograph consisting of a mass, a spring, and a
dashpot.

6.6.2 The damped harmonic oscillator 

The basic problem of seismometry is how to measure the
motion of the ground using an instrument that is also on the
ground. The traditional solution is to use an inertial system,
known as a pendulum, so that the motion of the pendulum is out
of phase with the ground motion. Three orthogonal
seismometers (vertical, north-south, east-west) can give a
three-dimensional record of ground motion. A schematic vertical
seismometer is shown in Fig. 6.6-1. The key elements of the
system are the mass, the spring, and a dashpot, or damping
device. We consider such a system in general, without concern
for the mechanics of how it is actually implemented.

This mechanical seismometer system is a damped simple
harmonic oscillator. If the spring equilibrium length in the absence
of ground motion is ξ_0, the spring exerts a force proportional to
its extension from equilibrium as a function of time, ξ(t) − ξ_0,
times a spring constant k. The dashpot, with damping constant
d, exerts a force proportional to the velocity between the mass
(m) and the earth. So, for a ground motion u(t),

$$m\frac{d^2}{dt^2}\left[\xi(t) + u(t)\right] + d\,\frac{d\xi(t)}{dt} + k\left[\xi(t) - \xi_0\right] = 0. \qquad (1)$$

If we let ξ denote the displacement relative to the
equilibrium position, ξ(t) − ξ_0, Eqn 1 becomes

$$m\ddot{\xi} + d\dot{\xi} + k\xi = -m\ddot{u}, \qquad (2)$$

or

$$\ddot{\xi} + 2\varepsilon\dot{\xi} + \omega_0^2\xi = -\ddot{u}, \qquad (3)$$

where the single and double dots denote the first and second
time derivatives, ω_0 = √(k/m) is the natural frequency of the
undamped system, and the damping is described by ε = d/(2m).
This is a linear differential equation with constant coefficients
that we encountered when we used a damped harmonic
oscillator as a model for anelasticity (Section 3.7.5). Thus Eqn 3 is the
inhomogeneous (forcing term) version of Eqn 3.7.8, where the
damping term ε appeared as ω_0/2Q. To solve it, we assume that

$$u(t) = e^{-i\omega t} \quad \text{and} \quad \xi(t) = X(\omega)e^{-i\omega t} \qquad (4)$$

and substitute Eqn 4 into Eqn 3 to yield

$$X(\omega)\left(-\omega^2 - 2\varepsilon i\omega + \omega_0^2\right)e^{-i\omega t} = \omega^2 e^{-i\omega t}, \qquad (5)$$

or

$$X(\omega) = -\omega^2/\left(\omega^2 - \omega_0^2 + 2\varepsilon i\omega\right), \qquad (6)$$

which is the instrument response produced by a ground motion
e^{−iωt}.

X(ω) is complex and can be written in terms of the amplitude
and phase responses

$$X(\omega) = |X(\omega)|\,e^{i\phi(\omega)}, \qquad (7)$$

where

$$|X(\omega)| = \omega^2/\left[(\omega^2 - \omega_0^2)^2 + 4\varepsilon^2\omega^2\right]^{1/2}, \qquad (8)$$

$$\phi(\omega) = -\tan^{-1}\left(\frac{2\varepsilon\omega}{\omega^2 - \omega_0^2}\right) + \pi. \qquad (9)$$


As shown in Fig. 6.6-2, these functions have several interesting
features. First, as the angular frequency of the ground motion,
ω, approaches the natural frequency of the pendulum, ω_0, the
amplitude response is large. This effect, called resonance, is like
“pumping” a playground swing at its natural period. Thus the
seismometer responds best to ground motion near its natural
period.

For frequencies much greater than the natural frequency,
ω ≫ ω_0, |X(ω)| → 1, and φ(ω) → π, so the seismometer records
the ground motion, but with the sign reversed.⁴ To see why this
occurs, consider Eqn 3. For ω ≫ ω_0, the $\ddot{\xi}$ term is the
largest term on the left-hand side, so $\ddot{\xi}$ approximately
equals $-\ddot{u}$. Thus the seismometer responds to the ground
displacement. On the other hand, for frequencies much less than
the natural frequency, ω ≪ ω_0, |X(ω)| → ω²/ω_0², and φ(ω) → 0.
Hence, in this case the seismometer responds to acceleration, as
can be seen from Eqn 3, because the ω_0²ξ term is dominant, so ξ is
proportional to $\ddot{u}$. The shape of the instrument response
depends on the damping factor h = ε/ω_0. For h = 0, the system is
undamped, and the amplitude response is peaked around the
resonant frequency, ω = ω_0. The seismometer amplifies ground
motion with periods near its natural period. As damping is
increased, the curve is smeared out. Thus the natural period
and damping are used to design a seismometer to record
ground motion in a particular period range.

⁴ To see this, quickly jiggle an object hanging by a rubber band and note that its
motion is out of phase with your hand.

Fig. 6.6-2 Amplitude response |X(ω)| and phase delay φ(ω) for a
pendulum seismometer such as that shown in Fig. 6.6-1.
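
These limits can be checked numerically by evaluating Eqn 6 directly. In the sketch below the function names and the test frequencies are ours, and the phase is taken from the complex value of X(ω) with `cmath.phase`, which places the inverse tangent of Eqn 9 in the correct quadrant.

```python
import cmath

def response(omega, omega0, h):
    """Complex pendulum response X(omega) of Eqn 6, with damping
    factor h = epsilon / omega0."""
    eps = h * omega0
    return -omega**2 / (omega**2 - omega0**2 + 2j * eps * omega)

def amplitude_and_phase(omega, omega0, h):
    """Amplitude |X| and phase (radians) of the response."""
    X = response(omega, omega0, h)
    return abs(X), cmath.phase(X)

omega0 = 1.0
amp_hi, ph_hi = amplitude_and_phase(100.0, omega0, 0.5)   # omega >> omega0
amp_lo, ph_lo = amplitude_and_phase(0.01, omega0, 0.5)    # omega << omega0
amp_res, ph_res = amplitude_and_phase(1.0, omega0, 0.1)   # at resonance

print(f"high frequency: |X| = {amp_hi:.4f}, phase = {ph_hi:.4f}")
print(f"low frequency:  |X| = {amp_lo:.3e}, phase = {ph_lo:.4f}")
print(f"resonance:      |X| = {amp_res:.4f}, phase = {ph_res:.4f}")
```

At high frequency the amplitude approaches 1 and the phase approaches π. At low frequency the amplitude approaches ω²/ω_0² and the phase approaches 0. At resonance the amplitude is 1/(2h), so lighter damping gives a sharper peak.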

Figure 6.6-2 bears a strong resemblance to Fig. 3.7-13, which
showed the frequency response for a damped harmonic
oscillator as a function of Q. The plots are slightly different, in
that Fig. 3.7-13 is plotted as a function of ω, and Fig. 6.6-2 is
plotted as a function of ω_0/ω. In addition, Fig. 6.6-2 is
normalized to the value at ω_0/ω = 0. However, the curves convey
the same information because h and Q are related as h = 1/(2Q).
The Q values in Fig. 3.7-13 of 5, 15, and 100 correspond to h
values of 0.1, 0.03, and 0.005, all of which would plot close to
the curve for h = 0 in Fig. 6.6-2.

6.6.3 Earth noise

An important consideration in designing seismometers is earth 
noise. A challenge of seismometry is to create sensors sensitive 
enough to record small teleseismic signals, given that noise sets 
a limit to the level of detection. Moreover, studies using seismic 
data in many applications must consider the signal-to-noise 
ratio. 

Many factors contribute to seismic noise, including solar and 
lunar tides within the solid earth, fluctuations in temperature 
and atmospheric pressure, storms, human activities, and ocean 
waves. These factors are constantly at work, so the crust is 
continually reverberating. Most of the noise occurs at periods 
between 1 and 10 seconds. Such waves, called microseisms , are 
shown in Fig. 6.6-3 (top). Even before the first waves arrive 
from the earthquake shown, the seismogram shows a roughly 
constant level of seismic energy (center). The spectrum shows 
that most of this noise is in the frequency range of 0.1-0.2 Hz 
(periods of 5-10 s) (bottom). The primary source for these 
microseisms is thought to be ocean waves. Seismometers are 
noisier the closer they are to coastlines, so ocean island 
stations are among the noisiest. 

How a seismometer is deployed has a great effect upon the 
noise that it records. Most sources of noise decrease away from 
the surface, so permanent seismometer installations are often 
in boreholes. For portable seismometers, burying them even 
half a meter beneath the surface greatly reduces noise from 
daily temperature fluctuations. Rain generates high-frequency
noise, and wind, coupled to the ground through the roots of 
swaying trees, can generate severe long-period noise. Human 
activity (trucks, trains, machinery, etc.) causes significant 
ground noise, so seismologists deploying temporary stations 
face a trade-off between the convenience (continuous power, 
security, constant temperature, no flooding) of building base¬ 
ments and the lower noise of remote sites. 

6.6.4 Seismometers and seismographs 

Seismometers record ground motions ranging from large 
high-frequency accelerations near an earthquake to small 
ultra-long-period normal mode signals. Because no single 
seismograph can do this, different instruments have evolved 
to handle the different dynamic ranges and frequency ranges of 
seismic waves. 

Fig. 6.6-3 Demonstration of seismic noise on a broadband seismogram
in Hudson, New York, from an April 7, 1995, Tonga earthquake.
Top: Seismic noise appears before the first arrival, which is $P_{\mathrm{diff}}$.
Center: Visual examination of the noise shows waves with a dominant
period of about 5-6 s, called microseisms. Bottom: The spectrum
of the noise has largest amplitude in the 5-10 s period range.

Dynamic range is measured in decibels (dB), which increase
by 20 for each order of magnitude increase in amplitude. Thus,
if signal $A_1$ is five orders of magnitude larger than signal $A_2$,
$A_1/A_2 = 10^5$, and the dynamic range is 100 dB. The displacements
associated with a magnitude 2 earthquake may be as
low as $10^{-10}$ m, whereas teleseismic displacements from a
magnitude 8 earthquake may be on the order of $10^{-1}$ m, and
displacements near a large earthquake can be much greater.
Thus the dynamic range of seismometry is at least 180 dB.
Similarly, the frequency range of seismometers spans seven
orders of magnitude from Earth tides (0.000023 Hz) to ultra-high
frequencies of greater than 200 Hz for very shallow structure
investigations.
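
The decibel arithmetic above can be reproduced in a few lines. This is a minimal sketch using the amplitude definition of the decibel, 20 log10 of the amplitude ratio. The function name is illustrative.

```python
import math

def dynamic_range_db(a_large, a_small):
    """Dynamic range in decibels between two amplitudes."""
    return 20.0 * math.log10(a_large / a_small)

# Two signals differing by five orders of magnitude in amplitude.
print(dynamic_range_db(1e5, 1.0))

# Displacements of 10^-1 m (teleseismic, magnitude 8) versus 10^-10 m (magnitude 2).
print(dynamic_range_db(1e-1, 1e-10))

# Frequency span from Earth tides to 200 Hz, in orders of magnitude.
print(math.log10(200.0 / 0.000023))
```

The last value is just under seven, consistent with the seven orders of magnitude quoted for the frequency range.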

The earliest attempts to record the motions of earthquakes
used seismoscopes, which differ from seismographs in that
they record ground motion without time information. The first
known seismoscope, built by the Chinese astronomer Chang
Heng in about AD 132, consisted of a pendulum inside a 6 ft-diameter
jar. Eight dragons’ heads with metal balls in their
mouths were placed around the rim of the jar, so the balls
would drop in the direction from which seismic waves arrived.
Later seismoscopes included a pendulum etching a path on a
bed of sand (A. Bina, 1751), a collection system for a bowl
filled to the brim with mercury (A. Cavalli, 1784), and optical
reflection off a basin of mercury (R. Mallet, 1851). Two very
different seismoscope recordings are shown in Fig. 6.6-4.

Fig. 6.6-4 Two examples of seismoscope
recordings, which show the amplitudes
of motions without a record of time.
Left: Seismogram of the great 1906 San
Francisco earthquake, recorded by the
Ewing duplex pendulum seismoscope in
Carson City, Nevada. (Kanamori, 1988.
Importance of historical seismograms
for geophysical research, in Historical
Seismograms and Earthquakes of the World,
ed. W.H.K. Lee, H. Myers and K. Shimizaki,
copyright 1988 by Academic Press,
reproduced by permission of the publisher.)
Right: Seismogram of a $m_b = 4.3$ earthquake
in Hawaii, recorded as a telescope image
at the Hawaii Telescope Observatory.
The dark images are stars, and the lines
emanating from the large star at the upper
center of the image result from tilting of the
telescope during the earthquake. (Courtesy
of L. Meech.)

Early seismometers, incorporating a record of the time- 
dependence of the ground motion, were purely mechanical 
instruments like that outlined in Section 6.6.2. Seismometry 
began with the designs of F. Cecchi around 1875, and devel¬ 
oped rapidly through the work of seismologists like J. Milne, J. 
Ewing, and T. Gray. The first teleseismic recording was by a 
seismograph in Potsdam of a Japanese earthquake in 1889. 
By the start of the twentieth century a global network of more 
than 40 seismographs was in operation. Such instruments often 
produced excellent data but responded best to very large earth¬ 
quakes because their magnifications were low, only about 100 
times the actual ground motion. 

Higher magnifications are achieved by using electromagnetic 
instruments, based on a design introduced by Galitzin in 1914 
that is now common. The motion of the pendulum relative to 
the frame is measured by moving a coil attached to the mass 
through the magnetic field produced by a magnet fixed to the 
seismometer frame. The voltage produced in the coil is pro¬ 
portional to the time rate of change of the magnetic field, and 
thus to the velocity of the mass relative to the frame (Fig. 
6.6-5). The sensitivity can be increased by feeding the output 
from this sensor into a galvanometer, a wire suspended by a 
thin fiber such that it is deflected by the current produced by the 
sensor (Fig. 6.6-6). A mirror is attached so that ground motion
deflects the mirror and thus changes the position of a beam
of light hitting a piece of photographic paper. The paper is
mounted on a helical drum which turns once per hour.

Fig. 6.6-5 Schematic illustration of an electromagnetic seismograph, in
which the mass is coupled to an electromagnetic transducer. Motions of
the mass move the coil through the magnetic field, generating an electric
current. The voltage across the coil is proportional to the relative velocity
between the mass and the magnet.

Thus the response of an electromagnetic analog seismometer
system is a combination of the pendulum, transducer
(electromagnetic velocity sensor), and galvanometer responses.
These are shown as log-log plots in Fig. 6.6-7. The pendulum
response (Fig. 6.6-7a, b) is proportional to $\omega^2$ for $\omega < \omega_s$, the
pendulum frequency. The transducer response (Fig. 6.6-7c, d)
is proportional to $\omega$ because it responds to the velocity,
the derivative of displacement. The galvanometer response
(Fig. 6.6-7e, f) falls off as $\omega^{-1}$ for $\omega > \omega_g$, the galvanometer
frequency. The combined effect is shown in Fig. 6.6-7g, h. Thus,
the response of an electromagnetic seismometer can be “shaped”
by choosing the pendulum and galvanometer periods.

Fig. 6.6-6 Coupling of the transducer of an
electromagnetic seismograph to a galvanometer, which
deflects a mirror and thus a light beam, causing a time
history of the voltage and thus the mass movements to be
recorded on photographic paper. Timing pulses deflect
the mirror to make minute and hour marks.

Fig. 6.6-7 Response of the components of an electromagnetic
seismograph system. Left panels show the amplitude responses, and right
panels show the phase responses. $\omega_s$ and $\omega_g$ are the pendulum and
galvanometer frequencies.

Two classic electromagnetic instruments used heavily for
years were the World Wide Standardized Seismograph Network
(WWSSN) long- and short-period instruments. The long-period
(LP) instrument had a pendulum period of 15 s (30 s
in some early versions) and a galvanometer period of 100 s.
The short-period (SP) instrument had a 1 s pendulum and a 0.75 s
galvanometer. Each WWSSN station had three LP and three SP
instruments oriented to record ground motion in the vertical,
east-west, and north-south directions. The resulting response
curve of the LP instrument (labeled “DWWSSN” from when
some of the WWSSN seismometers were converted to record
digitally) is shown in Fig. 6.6-8. Instruments ran at several
possible magnifications (gains). The two different instruments
were designed to reduce the effects of seismic noise. The LP
sensors had peak sensitivity in the 10-40 s range, making them
ideal for long-period teleseismic studies. The SP sensors were
peaked at around 1 s, a good period with which to pick the
travel times of P waves.

Fig. 6.6-8 Frequency domain instrument responses for several types of
seismometers. The SRO and DWWSSN sensors have responses peaked at
long periods and so do not record high-frequency signals. The STS-1,
STS-2, and Guralp-3T sensors are broadband seismometers with a flat
response over a wide range of frequencies.

Fig. 6.6-9 Sample WWSSN seismogram, showing the long-period vertical component from an earthquake in the Indian Ocean, recorded 36° away in
Pakistan.

A sample of the data is shown in Fig. 6.6-9. The record,
covering 24 hours, has calibration pulses at the beginning,
which can be used to check the amplitude and phase
calibration. Timing marks, generated by crystal clocks accurate
to 1 part in $10^7$, are placed at each minute (short mark) and
each hour (longer mark). Every sixth hour has no hour mark.
This timing allowed arrival times to be read accurately, and the
calibration allowed studies using true amplitudes. The seismograms
were microfilmed and made available to the seismological
community.
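
To put the quoted clock accuracy in perspective, the short sketch below estimates the worst-case timing error that could build up over one 24-hour record. It assumes that the error grows linearly with time and that the clock is not resynchronized during the record. Both are simplifying assumptions, and the function name is illustrative.

```python
def max_clock_drift_seconds(interval_seconds, fractional_accuracy=1e-7):
    """Worst-case timing error accumulated over an interval.

    Assumes linear drift at the stated fractional accuracy (1 part in 10^7
    by default) with no resynchronization during the interval.
    """
    return interval_seconds * fractional_accuracy

one_day = 24 * 60 * 60  # seconds in a 24-hour record
drift = max_clock_drift_seconds(one_day)
print(f"{drift * 1000.0:.2f} ms per day")
```

Under these assumptions the drift over a day is a few milliseconds, small compared with the precision to which arrival times can be read from a paper record.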

Although many results discussed in this text were derived 
from such data, using WWSSN data was cumbersome. Microfiche
records had to be acquired, examined in a microfiche reader,
copied, and refiled. The traces were then digitized by taping 
them to a special table that contained a grid of electromagnetic 
wires and then tracing the seismogram with a cursor. After 
digitization, the seismogram was interpolated to a desired sam¬ 
pling rate. The hand digitization added a source of error, as it 
was not always easy to follow the trace of interest, especially 
for large earthquakes where the surface waves could wrap 
around the seismic record for several hours. Because of the 
effort involved, entire Ph.D. dissertations might involve the 
analysis of only tens or hundreds of seismograms, a task that is 
now done in minutes to days. 

The replacement of analog seismographs by digital broad¬ 
band instruments has important advantages. The newer 
seismometers provide better data over a broader frequency 
band, and the digital data are available via magnetic tape, 
compact disk, or the Internet, making computer analysis much 
easier. Routine processing, such as rotating into radial and 
transverse components and making record sections, has be¬ 
come nearly trivial. Large volumes of data are available and 
can be processed easily. For example, as of 2000 the IRIS Data 
Management Center had over 7 Tbytes of digital data available 
over the Internet either immediately or with only the short 
delay needed for it to be read from mass storage systems. 

Some of the technology involved in more recent seismograph
systems is illustrated by one of the first digital seismological
systems, the instrument used by the International Deployment
of Accelerometers (IDA) shown schematically in Fig. 6.6-10.
The sensor is a force-feedback gravimeter that detects vertical
ground motion by the resulting change in gravity. The gravimeter
mass is connected to the center plate of a capacitor whose
outer two plates are fixed. As the mass moves, the voltage
between the center plate and the outer plates is proportional to
the displacement. A 5 kilohertz alternating voltage applied to
the outer plates is amplitude-modulated (Section 2.8.1) by the
lower-frequency seismic signal. The modulated signal is fed
to an amplifier that generates a voltage proportional to the
displacement of the mass. This signal then goes to an integrator
circuit whose output is proportional to the acceleration of the
mass. This is the seismic system’s output, which is sampled
once every ten seconds. The voltage is also fed back to the outer
capacitor plates to stabilize the system and increase linearity.
This force-feedback, an important feature of modern seismometers,
provides a greater dynamic range because the mass
does not move as far to record large amplitudes. Because
this instrument can record a static displacement, it has a flat
response out to frequencies approaching $\omega = 0$. Such long-period
response is valuable for studying normal modes and
large earthquakes.

Fig. 6.6-10 Block diagram of the sensing and feedback electronics of
an IDA gravimeter recording system. (Agnew et al., 1976. Eos Trans.
Am. Geophys. Un., 57, 180-8, copyright by the American Geophysical
Union.)

The most versatile of the current digital seismometers are 
broadband systems that record over a very broad frequency 
range. At present, the primary broadband seismometers are the 
Streckeisen STS-1 and STS-2 and Guralp-3T, which use
force-feedback technology to allow large dynamic and frequency
ranges (Fig. 6.6-8). The advantages of such a broad frequency 
response are illustrated in Fig. 6.6-11. As shown, the seismo¬ 
gram can be filtered to isolate and give excellent records of two 
very different overlapping signals. These seismometers are very 
compact (the three-component STS-2 is the size of a bowling 
ball and weighs 20 lb) 5 but record with a flat response over more than
three orders of magnitude in frequency. The STS-1 is designed
for permanent installation, whereas the STS-2 and Guralp-3T 
are robust enough to be used as portable instruments. 

A variety of specialized seismic instruments are also used. 
Strainmeters are used to measure gradual displacements, especi¬ 
ally near faults and volcanoes. Such instruments are technically 
challenging to build, and have taken unusual forms. For 
instance, an early strainmeter made by H. Benioff consisted of 
a quartz rod 24 m long, attached to the ground at one end, 
and extending through a capacitance transducer at the other. 
Strain rates as small as $10^{-15}$ s$^{-1}$ could be recorded. A recent
strainmeter with a hydraulic sensor achieves a strain sensitivity
of $10^{-12}$ with a dynamic range of about 130 dB. Over longer
distances, horizontal strains are observed using laser measure¬ 
ments between sites (often across faults) and space-geodetic 
techniques (Section 4.5), including the GPS satellite system and 
very long baseline radio interferometry. 

At the other end of the spectrum of seismic instrumentation
are strong-motion sensors that record strong shaking near
an earthquake. Whereas strainmeters record minute displacements,
strong-motion sensors, also called accelerometers,
record accelerations up to 2 g without breaking or going off
scale. For example, horizontal accelerations of 1.25 g were
recorded 3 km from the 1971 San Fernando Valley earthquake,
and vertical accelerations of 1.74 g were recorded 1 km from
the 1979 Imperial Valley earthquake. Thus the seismometer
pendulum frequency $\omega_0$ is chosen to exceed the highest frequency
of interest (about 20 Hz). These instruments are stable
because the small pendulums make the accelerometers less
susceptible to tilt and drift than longer-period instruments. A
damping parameter (often 0.7 of the critical value) is chosen to
give a response curve that is flat and directly proportional to
ground acceleration from periods of zero to the natural period
of the seismometer.

5 Before such technology, some mechanical seismometers built in the first half of
the twentieth century weighed more than 20 tons because the large mass gave higher
long-period magnification, as shown by Eqn 6.

Fig. 6.6-11 STS-2 broadband seismogram recorded in Slippery Rock, PA,
from a July 3, 1995, Tonga earthquake. Because the seismometer records
a wide range of frequencies, the same seismogram can be used to study
both local and teleseismic events. (a): The original broadband record.
(b): The same record, low-pass filtered at a frequency of 0.03 Hz, showing
the long-period teleseismic signals from the Tonga event. (c): The record
high-pass filtered at 0.5 Hz, showing the high-frequency signals from a
local event. (d): A zoom-in of the high-pass filtered record shows the full
waveform of the local event. The S - P time suggests that the event was
20 km away from the station, probably a local quarry blast.

Fig. 6.6-12 Diagram showing the analog-to-digital
(ADC) process. The analog part of the system consists of
the generation of a seismic signal by the seismometer, its
amplification, and analog anti-aliasing (AAA) filtering.
The digital part of the system consists of sampling the
AAA-filtered signal, filtering the signal further with a
digital anti-aliasing (DAA) filter, and then decimating
the signal to achieve the desired sampling rate.
(Scherbaum, 1996, with kind permission from Kluwer
Academic Publishers.)

A major advance in seismometry has been in timing, which 
has long been a difficulty. In the early days of seismology, 
timing errors played a large part in the mislocation of earth¬ 
quakes. However, seismometers now receive time signals from 
GPS satellites, whose atomic clocks are accurate to a billionth 
of a second. Similarly, although ocean bottom seismometers 
cannot receive GPS signals, accurate clocks for them are now 
available. 

6.6.5 Digital recording

Although digital seismic data are easier to use than analog data, 
the conversion of continuous ground motion into a digital 
seismogram is not a trivial matter. Figure 6.6-12 shows how 
this is done. Ground motion, represented by the waveform at 
the left, is detected by the seismometer through the motion 
of the mass. This motion is converted into an analog electrical 
signal and then amplified. To avoid a spurious signal due to 
aliasing (Fig. 6.4-3), a combination of anti-aliasing filters is 
used. Many seismometers use an initial frequency domain low- 
pass filter as an analog anti-aliasing (AAA) filter. The filtered 
signal is then oversampled at a rate that is at least twice the
corner frequency of the AAA filter in order to avoid aliasing. This
signal is then convolved with a digital anti-aliasing (DAA) 
filter, often called a finite impulse response (FIR) filter, and 
finally resampled at twice the desired Nyquist frequency. 
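
The role of the digital anti-aliasing step can be illustrated with a small numerical experiment. The sketch below is not the filter used by any particular digitizer. It assumes a Hamming-windowed sinc low-pass filter, an 80 Hz oversampled stream, and decimation by 4 to 20 Hz, all hypothetical values. The test signal contains a 1 Hz component to be kept and a 27 Hz component that lies above the 10 Hz output Nyquist frequency and so would alias to 7 Hz.

```python
import math

def lowpass_fir(cutoff_hz, fs_hz, ntaps):
    """Hamming-windowed sinc low-pass taps, normalized to unit gain at zero frequency."""
    mid = (ntaps - 1) / 2
    taps = []
    for n in range(ntaps):
        x = 2.0 * cutoff_hz / fs_hz * (n - mid)
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (ntaps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def convolve_centered(signal, taps):
    """Centered convolution; samples beyond the ends of the record count as zero."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

def amplitude_at(signal, freq_hz, fs_hz):
    """Amplitude of one frequency component, by projection onto a sinusoid."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * n / fs_hz) for n, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * n / fs_hz) for n, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / len(signal)

fs = 80.0      # oversampled rate in Hz (hypothetical)
factor = 4     # decimation factor: output rate 20 Hz, Nyquist frequency 10 Hz
x = [math.sin(2.0 * math.pi * 1.0 * i / fs) + math.sin(2.0 * math.pi * 27.0 * i / fs)
     for i in range(1600)]

naive = x[::factor]                                    # decimation with no DAA filter
taps = lowpass_fir(cutoff_hz=6.0, fs_hz=fs, ntaps=41)
filtered = convolve_centered(x, taps)[::factor]        # DAA filter, then decimation

fs_out = fs / factor
alias_naive = amplitude_at(naive, 7.0, fs_out)
alias_filtered = amplitude_at(filtered, 7.0, fs_out)
signal_kept = amplitude_at(filtered, 1.0, fs_out)
print(f"7 Hz alias without filter: {alias_naive:.3f}")
print(f"7 Hz alias with filter:    {alias_filtered:.3f}")
print(f"1 Hz signal after filter:  {signal_kept:.3f}")
```

Without the filter the 27 Hz energy reappears at 7 Hz at nearly full amplitude. With the filter it is strongly suppressed while the 1 Hz signal is essentially unchanged.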

Fig. 6.6-13 Example of a FIR filter, a type of DAA filter, and its effects.
When the FIR filter (a) is used for the digital anti-aliasing, the resulting
signal (c) retains the wave shape of the original signal, but is preceded by
high-frequency artifacts. When a phase-corrected FIR filter (b) is applied
instead, the precursory effects vanish (d), but the seismic signal is phase-
shifted from the original. (After Scherbaum, 1996, with kind permission
from Kluwer Academic Publishers.)

An example of a FIR filter is shown in Fig. 6.6-13a, with the
resulting signal shown in Fig. 6.6-13c. The FIR filter maintains
the shape of the pre-filtered signal, but introduces spurious
noncausal arrivals that might be mistaken for early stages of
earthquake rupture. These precursory signals result because
the FIR filter’s impulse response is an emergent signal. This
effect can be removed by correcting the phase of the FIR filter
to make it causal (Fig. 6.6-13b). This filter does not cause
precursory signals (Fig. 6.6-13d), but the shapes of the waveforms
are changed. We noted a similar phenomenon in Section 3.7.8,
where anelasticity acted as a filter, removing high frequencies
and making the waveforms noncausal unless the phase was
changed. As discussed in Section 6.3.3, there is no perfect way
to filter a seismic signal, so we decide what we seek and what
we will accept as a consequence.
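
The origin of the precursory arrivals can be seen in a toy example. The sketch below applies two hypothetical sets of filter taps to an impulsive arrival: a symmetric set applied as a centered, zero-phase filter, and a different, one-sided set applied causally. The causal taps are only a stand-in for illustration. They are not the phase-corrected version of the symmetric filter described above, which would require a minimum-phase design.

```python
def apply_fir(signal, taps, delay):
    """Convolve signal with taps; output sample i uses inputs i + delay - k for tap k.

    delay = 0 gives a causal filter (only present and past inputs are used).
    delay = len(taps) // 2 centers a symmetric filter so that it has zero phase.
    """
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + delay - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

onset = 20
# Impulsive arrival beginning at sample 20, preceded by silence.
signal = [0.0] * onset + [1.0, -0.5, 0.25] + [0.0] * 20

symmetric = [0.1, 0.2, 0.4, 0.2, 0.1]   # hypothetical symmetric taps
one_sided = [0.5, 0.3, 0.2]             # hypothetical causal taps

zero_phase = apply_fir(signal, symmetric, delay=len(symmetric) // 2)
causal = apply_fir(signal, one_sided, delay=0)

first_zero_phase = next(i for i, v in enumerate(zero_phase) if v != 0.0)
first_causal = next(i for i, v in enumerate(causal) if v != 0.0)
print("first nonzero sample, zero-phase filter:", first_zero_phase)
print("first nonzero sample, causal filter:    ", first_causal)
```

The zero-phase output becomes nonzero two samples before the true onset, because half of the symmetric filter looks ahead in time. The causal output starts at the onset, but its waveform is no longer a symmetrically smoothed copy of the input.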

Because the seismogram depends on the instrument response 
that is convolved with the ground motion, obtaining the 
ground motion requires specifying the frequency response of 
the seismometer. This can be done by giving the amplitude 
and phase response as a list of the values at each frequency. A 
more compact representation gives the frequency response as a 
complex fraction like 

$$T(i\omega) = \frac{\beta \prod_{j=1}^{L} (i\omega - z_j)}{\alpha \prod_{k=1}^{N} (i\omega - p_k)}. \qquad (10)$$

The fraction is described by a set of L complex zeros $z_j$ at
which the numerator is zero, N complex poles $p_k$ at which the
denominator is zero, and the constants $\beta$ and $\alpha$. Because the
frequency terms $i\omega$ are always imaginary and the poles always
contain a real part, the denominator never becomes zero,
avoiding any singular values.

The instrument responses in Fig. 6.6-8 were calculated 
from the poles and zeroes of the seismometer responses. For 
example, the STS-1 response has three zeroes, all equal to 
(0, 0), and four poles, which come as complex conjugate pairs:
(-0.0123, 0.0123), (-0.0123, -0.0123), (-39.1800, 49.1200),
(-39.1800, -49.1200). These poles provide the corner frequencies
and determine the sharpness of the corners. Similarly, the
DWWSSN response has five zeroes and 11 poles. 
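
A response given as poles and zeros is straightforward to evaluate. The sketch below uses the STS-1 values quoted above and makes several assumptions that the text does not state: the poles and zeros are taken to be in rad/s, the response is taken to be relative to ground displacement, and the constant in front of the fraction is set to 1, so only the shape of the curve is meaningful. The function name is illustrative.

```python
def pole_zero_response(omega, zeros, poles, scale=1.0):
    """Evaluate scale * prod(i*omega - z_j) / prod(i*omega - p_k) at angular frequency omega."""
    s = 1j * omega
    numerator = complex(scale)
    for z in zeros:
        numerator *= (s - z)
    denominator = complex(1.0)
    for p in poles:
        denominator *= (s - p)
    return numerator / denominator

# STS-1 values quoted in the text; units of rad/s are an assumption.
sts1_zeros = [0j, 0j, 0j]
sts1_poles = [
    complex(-0.0123, 0.0123), complex(-0.0123, -0.0123),
    complex(-39.18, 49.12), complex(-39.18, -49.12),
]

for omega in (0.001, 0.1, 1.0, 1000.0):
    t = pole_zero_response(omega, sts1_zeros, sts1_poles)
    # Dividing by omega expresses the same response relative to ground velocity.
    print(f"omega = {omega:9.3f} rad/s   |T|/omega = {abs(t) / omega:.3e}")
```

With these assumptions the response divided by $\omega$ is nearly constant between the two pole pairs and falls off outside them, which is the flat velocity passband of a broadband sensor. The lower pole pair has modulus of about 0.0174 rad/s, corresponding to a corner period of roughly 360 s.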

Seismometers record combinations of ground displacement, 
velocity, or acceleration, depending upon the application. In a 
strong-motion seismometer, the displacements may be greater 
than the size of the instrument itself, so accelerations are 
measured to keep signals on scale. This makes sense because 
accelerations are primarily responsible for damage to struc¬ 
tures and so are considered in strong-motion studies. At the 
other end of the frequency spectrum, strainmeters are used to 
study slow tectonic displacements. In fact, if they measured 
accelerations, the signals would be so small as to be unusable. 
Most other branches of earthquake seismology fall in between, 
using the waves from distant earthquakes, and so use seismo¬ 
meters that record ground velocity. 

Fig. 6.6-14 Demonstration in the time domain of the relation between displacement, velocity, and acceleration. (a): A synthetic example, consisting
of delta function-like acceleration pulses. The velocity and displacement signals are obtained through successive integrations of the accelerogram.
(b): A real example, with an accelerogram recorded on the first floor of a building in Los Angeles during the 1971 San Fernando earthquake. The velocity
and displacement records were obtained through successive integrations of the accelerogram. (Krinitzsky et al., 1993. Fundamentals of Earthquake
Resistant Construction. Copyright © 1993. Reprinted by permission of John Wiley & Sons, Inc.)

Although different instruments record displacement, velocity,
or acceleration, it is simple to convert between them. For
instance, given a velocity record, the acceleration is found by
taking the derivative of the seismogram, and the displacement
record is found by integrating. This is easily done in the
frequency domain, because if $F(\omega)$ is the Fourier transform of
$f(t)$, then $i\omega F(\omega)$ is the transform of $df(t)/dt$, and $-\omega^2 F(\omega)$ is
the transform of $d^2 f(t)/dt^2$ (Section 6.2.4). Thus, a velocity
seismogram can be converted to acceleration by multiplying the
complex value of its transform at each frequency by $i\omega$, or to
displacement by dividing by $i\omega$. Of the three, the displacement
seismogram has the greatest power at low frequencies, and
the acceleration seismogram has the greatest power at high
frequencies. In general, displacements have lower frequencies
than velocities, and velocities have lower frequencies than
accelerations, because integration “smoothes” a signal, whereas
differentiation makes it “rougher.” 6
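
The conversion can be sketched with a discrete Fourier transform. The example below uses a plain (slow) DFT so that it needs only the standard library, and a synthetic 3 Hz cosine as a hypothetical velocity record. The zero-frequency term is set to zero when dividing by $i\omega$, which amounts to dropping the constant of integration. That is a choice made here, not one prescribed by the text.

```python
import cmath
import math

def dft(x):
    """Discrete Fourier transform (direct O(n^2) sum)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def angular_frequencies(n, dt):
    """Angular frequency of each DFT bin; the upper half maps to negative frequencies."""
    return [2.0 * math.pi * (k if k < n / 2 else k - n) / (n * dt) for k in range(n)]

n, dt, f0 = 64, 1.0 / 64, 3.0
velocity = [math.cos(2.0 * math.pi * f0 * i * dt) for i in range(n)]

V = dft(velocity)
w = angular_frequencies(n, dt)

# Multiply by i*omega to differentiate; divide by i*omega to integrate.
acceleration = [c.real for c in idft([1j * wk * Vk for wk, Vk in zip(w, V)])]
displacement = [c.real for c in idft([Vk / (1j * wk) if wk != 0.0 else 0.0
                                      for wk, Vk in zip(w, V)])]

# Analytic results for comparison.
exact_acc = [-2.0 * math.pi * f0 * math.sin(2.0 * math.pi * f0 * i * dt) for i in range(n)]
exact_dis = [math.sin(2.0 * math.pi * f0 * i * dt) / (2.0 * math.pi * f0) for i in range(n)]
acc_error = max(abs(a - b) for a, b in zip(acceleration, exact_acc))
dis_error = max(abs(a - b) for a, b in zip(displacement, exact_dis))
print(f"max acceleration error: {acc_error:.2e}   max displacement error: {dis_error:.2e}")
```

For this single-frequency record the acceleration amplitude is larger than the velocity amplitude by a factor of $\omega = 2\pi f_0$ and the displacement amplitude is smaller by the same factor, which is the frequency weighting described above.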

Figure 6.6-14a illustrates this relation with three different 
versions of the same seismogram. If an accelerogram consists of 
high-frequency spikes (top), then smoother lower-frequency
velocity (center) and displacement (bottom) traces result from
integrating once and twice. Figure 6.6-14b shows this effect for 
a strong-motion seismogram of the 1971 San Fernando earth¬ 
quake, where the velocity and acceleration records have higher 
frequencies than the displacement. It is common in earthquake 
engineering to show the response of a structure to ground 
motions using a plot that shows the displacement, velocity, and 
acceleration. Figure 6.6-15 shows this formulation for the data 
in Fig. 6.6-14b. This representation uses the relation between 
the Fourier transforms expressed above, so the velocity scale is 
vertical, whereas the acceleration and displacement scales have 
opposite slopes as a function of frequency. 

6.6.6 Types of networks 

Most seismic experiments require multiple seismometers that 
are deployed in networks or arrays. Different applications, 
such as studying regional and global earth structure, resource 
exploration, seismicity monitoring, or identifying nuclear tests, 
lead to different deployment geometries. In some cases a 
unique network of stations is used for a particular application, 
but often an existing network has a geometry that is a com¬ 
promise for different objectives. 

Although the division is somewhat artificial, deployments of 
seismometers are often divided into global networks, regional 
networks, and arrays. Global networks are used to study global 
patterns of seismicity, plate tectonics, mantle convection, and 
earth structure. For these purposes seismometers should ideally 
be spread evenly around the world. This means, however, that 
the station spacing is too sparse to resolve the entire wave 
field. 7 Instead, individual measurements at separate stations 
are combined for applications including locating earthquakes, 
3-D tomography, and waveform analyses. 

6 An analogy might be to compare displacement and velocity to the topography and
gradient of a mountain. A kilometer of topography over a horizontal wavelength of a
meter would be very unusual, but a kilometer of topography over a longer wavelength
of 5-10 km would be a normal mountain. Similarly, large vertical gradients are rare at
the scale of mountains (El Capitan in Yosemite and the Jungfrau in Switzerland are
exceptions), but common at the higher spatial frequency scale of meters, as where a
path goes over a boulder.

7 By analogy to time series, such undersampling is termed spatial aliasing.

Fig. 6.6-15 Demonstration in the frequency domain of the relation
between displacement, velocity, and acceleration. In this example, taken
from the accelerogram in Fig. 6.6-14b, a site response spectrum of the
building housing the strong-motion seismometer is given as displacement,
velocity, and acceleration. The multiple curves show the amplitude of the
building response at various levels of damping, with the undamped curve
at the top, and successive levels of damping at 2%, 5%, 10%, and 20% of
critical damping. (Krinitzsky et al., 1993. Fundamentals of Earthquake
Resistant Construction. Copyright © 1993. Reprinted by permission of
John Wiley & Sons, Inc.)

The antithesis of a global network is a local array, where a
set of seismometers is deployed with a geometry chosen for a
particular goal. Array data are often analyzed as a single entity,
as in refraction and reflection studies (Sections 3.2 and 3.3).
Other examples are arrays used to locate distant nuclear tests.
Data from the array stations are stacked to track the propagation
of the wave field across the array, so the wave vector shows
the direction the waves came from and the distance they have
traveled. One of several exceptions to this division between
global networks and arrays is normal mode seismology, where
all the stations of a global network are sometimes used as a
single array.
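
The idea of stacking array data can be sketched with a simple delay-and-sum search. The example below is a toy under stated assumptions: a hypothetical four-station linear array, a Gaussian pulse standing in for the arrival, and a plane wave with a horizontal slowness of 0.08 s/km. A linear array constrains only the slowness component along its length, so this shows the principle rather than a full wave-vector estimate.

```python
import math

def pulse(t, width=0.2):
    """Gaussian pulse centered at t = 0 (a stand-in for a seismic arrival)."""
    return math.exp(-0.5 * (t / width) ** 2)

def stack_peak(positions_km, slowness_true, slowness_trial, t0=5.0, dt=0.05, nsamp=400):
    """Peak amplitude of the delay-and-sum stack for one trial slowness.

    Each station records the pulse delayed by slowness_true * x. The stack
    shifts each trace back by slowness_trial * x before summing.
    """
    best = 0.0
    for i in range(nsamp):
        t = i * dt
        total = 0.0
        for x in positions_km:
            # Trace at station x, read at time t + slowness_trial * x.
            total += pulse(t + slowness_trial * x - t0 - slowness_true * x)
        best = max(best, total)
    return best

positions = [0.0, 10.0, 20.0, 30.0]         # hypothetical station offsets (km)
true_slowness = 0.08                         # s/km, hypothetical plane wave
trials = [k * 0.01 for k in range(0, 21)]    # trial slownesses, 0.00 to 0.20 s/km

peaks = [stack_peak(positions, true_slowness, s) for s in trials]
best_slowness = trials[peaks.index(max(peaks))]
print(f"best slowness: {best_slowness:.2f} s/km, "
      f"apparent velocity: {1.0 / best_slowness:.1f} km/s")
```

The stack is largest when the trial slowness matches that of the incoming wave, because only then do the shifted traces add in phase.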

Between global networks and arrays are regional networks, 
which usually focus on the seismicity or structure of a par¬ 
ticular region. The data are sometimes analyzed with array 
techniques, but are more often combined as individual meas¬ 
urements (such as arrival times or amplitudes) in the same way 
as global network data. 

6.6.7 Global networks

Fig. 6.6-16 Station map of the Federation of Digital Broad-Band Seismographic Networks (FDSN) as of 1999. (Courtesy of the Incorporated Research
Institutions for Seismology.)

The global network of seismometers has a rich history. At the
start of the twentieth century there were already seismometers
in locations around the world, operated by groups including
many Jesuit institutions. Devastating earthquakes such as the 
1906 San Francisco and 1923 Tokyo events spurred the instal¬ 
lation of seismometers and the interchange of data. Bulletins 
of earthquake locations were published by several agencies, 
the most notable being the ISS/ISC bulletin (Section 6.6.1). By 
mid-century, the ISS received arrival times from several hun¬ 
dred stations for very large earthquakes. However, there were 
problems due to a lack of standardization. Different types of 
seismometers were used, with a wide range in the quality of the 
response, timing, and station operation practices. As a result, 
earthquake locations were often poor, and focal mechanisms, 
which require accurate information about polarities, were 
rarely derived. 

These problems were largely solved with the creation of the 
World Wide Standardized Seismographic Network. WWSSN 
seismometers were standardized and had known responses. 
The network was installed, starting in 1961, to monitor nu¬ 
clear testing within Eurasia, and had a high density of stations 
around the borders of the Soviet Union, China, and Eastern 
Europe. The WWSSN, which reached its peak of about 120 
stations in the late 1960s, gave a great boost to geophysics. 
Several great earthquakes in the 1960s, such as the 1964 Alaska 
earthquake, provided excellent sources for seismic investigations.
WWSSN data were crucial for advances in plate tectonics,
earthquake source studies, and global velocity structure. 

The first digital stations began to be deployed in the 1970s. 
Over the next two decades, the number of permanent digital 
seismometers increased gradually. Following the phase-out of 
the WWSSN, these became part of the Global Digital Seismic 
Network, the primary means of global broadband data collec¬ 
tion between 1977 and 1986. The GDSN was enhanced by 
the network of IDA gravimeters, beginning in 1977, and by the 
French GEOSCOPE network, which has deployed broadband 
seismometers since 1982. 

In 1986, the GDSN gave way to the IRIS Global Seismo¬ 
graphic Network (GSN) program, which incorporates many 
borehole seismometers with an aim toward global coverage, 
with 128 stations spaced about 2000 km apart. These are 
extremely quiet, permanent broadband seismic stations of the 
highest quality. The GSN is part of a larger Federation of 
Digital Broad-Band Seismographic Networks (FDSN) that also 
includes the US National Seismographic Network (NSN) and 
networks from other countries including Canada (CNSN), 
China (CDSN), France (GEOSCOPE), Germany (GEOFON), 
Italy (MEDNET), Japan (Pacific 21), and Taiwan (BATS). 
FDSN station locations are shown in Fig. 6.6-16. Some 
FDSN stations are also part of the International Monitoring
System (IMS) network used to monitor nuclear testing (Section
1.2.8).
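
The quoted GSN station spacing can be checked with simple arithmetic. The sketch below assumes a spherical earth with a mean radius of 6371 km and stations spread evenly over the whole surface, each occupying a square cell; the function name is illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth


def mean_station_spacing_km(n_stations, area_km2):
    """Side of the square cell each station would occupy if evenly spread."""
    return math.sqrt(area_km2 / n_stations)


# Surface area of the sphere, then the spacing for the 128 GSN stations.
earth_area_km2 = 4.0 * math.pi * EARTH_RADIUS_KM ** 2
gsn_spacing_km = mean_station_spacing_km(128, earth_area_km2)

print(f"Earth surface area: {earth_area_km2:.3g} km^2")
print(f"Mean spacing for 128 stations: {gsn_spacing_km:.0f} km")
```

Under these assumptions the result is close to the 2000 km quoted in the text. Real station spacing is less uniform, because most sites are on land.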

Although the present global network of broadband seismometers
relies on land sites, it is hoped that the global network
will soon include permanent ocean bottom seismometers
(OBS), especially in the Southern Hemisphere, where there
is much less land, and coverage is currently very uneven.
Although OBS instruments are currently used mostly for temporary
deployments, the technology is evolving to the point
where permanent sites are practical.

An important aspect of the different networks of high-quality 
broadband seismometers is considerable standardization in 
data processing and formatting. All 7 terabytes of seismic data 
archived by the IRIS DMC 8 as of 2000 are available in a format 
called SEED (Standard for the Exchange of Earthquake Data), 
which is the standard for the FDSN. SEED data can be converted
into whatever format an investigator requires.

It was not until the mid-1990s, more than 30 years after the 
start of the WWSSN, that the global number of permanent 
digital broadband seismometers surpassed the number of 
WWSSN stations at its heyday. However, digital data from all 
parts of the FDSN can be retrieved as if it were a single array, 
making it more powerful than the WWSSN for seismic analyses.
Many stations now report in real time through satellite
telemetry, so seismic signals arrive at data centers a fraction 
of a second after they occur, allowing better quality control. 
Efforts are being made to eventually have all GSN stations 
report in real time, which will be important for applications 
like tsunami warning. Software has been developed to take 
real-time data from different networks and display it on 
the Internet as if it were from a single array. Hence, anyone 
with a computer and access to the Internet will soon be able 
to examine global seismic data within seconds of them being 
recorded. 

6.6.8 Arrays 

For global networks, the precise configuration of individual 
stations is less important than the total coverage. However, 
the geometries of seismic arrays are optimized for certain 
investigations. Arrays can be linear, two-dimensional, and 
even three-dimensional, incorporating borehole seismometers 
(Fig. 7.3-8). 

There is always a trade-off between the benefits of linear 
versus two-dimensional arrays. The same number of stations, 
and therefore cost and time for installation, provides greater 
resolution if deployed in a linear manner, but the resulting 
two-dimensional “slice” into the earth does not image the third 
dimension. Linear arrays have long been the mainstay of active 
source reflection and refraction experiments. 9 A marine linear
array is easily deployed by towing hydrophones behind a ship,
and similar linear deployments are used for land-based studies.
These data are analyzed using techniques discussed in Sections
3.2 and 3.3.

8 Because all data are duplicated in a sort order, and also stored off site, the computer
storage needed is four times greater, or 28 Tbytes.

9 Active experiments include their own seismic sources, as opposed to passive
experiments using earthquake sources.

Linear arrays are most useful if the structure being investigated
varies most in one direction, as is often the case at plate
boundaries. For instance, Fig. 5.3-10 (bottom) showed the
seismic structure of the East Pacific Rise obtained from an array
of OBSs. Because the structure of the lithosphere changes much 
more significantly perpendicular to the ridge than parallel to 
it, most of the OBSs were deployed in a line crossing the 
ridge. Most of the remaining seismometers were placed in a 
second line, parallel to the first. Both lines were aligned along 
a great circle path to the seismogenic zones of Tonga and 
South America, so as to maximize the chance of obtaining good 
signals from distant earthquakes. Similarly, at subduction 
zones and transform faults structure varies more significantly 
across the plate boundary than along it, so refraction lines are 
often placed perpendicular to the boundary. For example, Fig. 
3.2-17 showed a cross-section of the western US lithosphere 
perpendicular to the San Andreas fault that was derived from 
refraction surveys. 

Two-dimensional arrays can create a three-dimensional 
image of a small region. As a result, two-dimensional arrays 
have been deployed around hot spots, rifts, plateaus, transform 
faults, and subduction zones to study their structure and tectonics.
Reflection data are also now commonly gathered by
two-dimensional surface deployments. An important contributor
to this development has been advances in computers and
graphics software that make it possible to analyze and model
such data and display the resulting earth structure in a comprehensible
fashion. Such three-dimensional images are of great
importance in exploring for oil and gas and managing existing 
oil and gas fields. 

Special two-dimensional arrays, often consisting of short- 
period vertical seismometers, have been used to monitor the 
locations and magnitudes of underground nuclear tests. The 
most ambitious such array was the circular Large Aperture 
Seismic Array (LASA), which operated in Montana from the 
mid-1960s until 1978. LASA was an array of arrays totaling 
525 high-frequency vertical seismometers. Twenty-one clusters
of 25 seismometers, each covering 7 km², were deployed
with a total array diameter of 200 km (Fig. 6.6-17). A similar
array is the Norwegian Seismic Array (NORSAR), built in
1971, with 22 sub-arrays spanning an area of 100 km². Part
of NORSAR, the NORESS array, has 24 seismometers distributed
within a 3 km-diameter circle. It has counterparts in
northern Norway, Finland, and Germany. As with the WWSSN,
arrays designed for nuclear monitoring have also been important
for studies of earth structure. Array data can be stacked
(Section 6.5), allowing small seismic signals to be extracted 
from noise. The characteristics of the inner core boundary were 
first quantified using stacked array data for PKiKP waves, which 
reflect at the boundary but are rarely identified on individual 
seismograms due to their small amplitudes. 
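
The benefit of stacking can be illustrated with synthetic data. The sketch below assumes that every trace carries the same small pulse at the same sample (that is, the traces have already been time-shifted to align the arrival) plus independent Gaussian noise. The amplitudes, trace counts, and function names are illustrative choices, not values from the text.

```python
import math
import random


def make_trace(n_samples, signal_amp, noise_std, rng):
    """One synthetic trace: Gaussian noise plus a pulse at the middle sample."""
    trace = [rng.gauss(0.0, noise_std) for _ in range(n_samples)]
    trace[n_samples // 2] += signal_amp
    return trace


def stack(traces):
    """Average the aligned traces sample by sample."""
    n = len(traces)
    return [sum(column) / n for column in zip(*traces)]


def rms(values):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(v * v for v in values) / len(values))


rng = random.Random(0)
n_traces, n_samples = 100, 400
traces = [make_trace(n_samples, 0.5, 1.0, rng) for _ in range(n_traces)]
stacked = stack(traces)

# Measure the noise level away from the pulse, before and after stacking.
mid = n_samples // 2
noise_single = rms(traces[0][:mid] + traces[0][mid + 1:])
noise_stack = rms(stacked[:mid] + stacked[mid + 1:])

print("noise rms, single trace:", round(noise_single, 3))
print("noise rms, stack       :", round(noise_stack, 3))
print("stacked amplitude at pulse:", round(stacked[mid], 3))
```

For independent noise, averaging N traces should reduce the noise level by roughly the square root of N, while the aligned pulse keeps its amplitude. A pulse at half the noise level on a single trace therefore stands well above the noise in a stack of 100 traces.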
Fig. 6.6-17 Seismometer geometry of the
Large Aperture Seismic Array (LASA).
(Capon, 1969. J. Geophys. Res., 74, 3182-94,
copyright by the American Geophysical
Union.)


6.6.9 Regional networks 

Regional networks, intermediate between global networks and 
arrays, are usually constructed to monitor local seismicity or 
volcanism. Including Alaska, Hawaii, and Puerto Rico, over 
3200 seismic stations are part of more than 40 separate US 
networks (Fig. 6.6-18). Some have only a few stations, and 
some have hundreds. Many use short-period vertical sensors, 
but some use accelerometers. For example, the California 
Strong-Motion Instrumentation Program operates more than 
400 accelerometers to provide data for earthquake engineers. 
Strong-motion data also provide excellent information on 
source properties because much of the seismic signal is severely 
attenuated at teleseismic distances. Some networks also incorporate
broadband seismometers. For instance, as of 2000, the
Southern California Seismographic Network operated 79 broadband
stations in addition to its 163 short-period instruments.
Regional network stations can also be valuable for earth structure
studies, as shown in Fig. 6.6-19.

Many countries have regional networks. For instance, as 
of 1999, Japan had about 560 stations in operation. These 
stations have provided valuable data about the subduction 
process there, including the double seismic zones (Fig. 5.4-20) 
and ScS-to-P conversions at the slab top (Fig. 2.6-15). 

Regional networks, like global networks, are continually 
being upgraded. In the USA there are efforts under way, as part
of the Advanced National Seismic System (ANSS), to install
more broadband and short-period seismometers, and to add 
about 6000 strong-motion sensors in urban areas at risk from 
damaging earthquakes. A very ambitious network planned is 
the USArray, which would have three different components 
operating simultaneously. First, the number of permanent 
broadband stations would be increased (Fig. 6.6-20, left). 
Second, 400 portable broadband seismometers would travel 
around the country. Over eight years, this “bigfoot” array 
would visit about 2000 sites in the continental USA, with 
an average station spacing of about 70 km, before going to 
Alaska and Hawaii (Fig. 6.6-20, right). Third, about 2400 
seismometers (a mix of broadband, short-period, and high-frequency
sensors) would be used as flexible arrays to accompany
the moving array. As planned, USArray will be an array at
the scale of a regional network. Data from the moving array 
will be available in near-real time, and can be processed using 
migration techniques to attain high-resolution imaging deep 
into the mantle. 

Interestingly, because there is an increasing trend toward 
real-time telemetry for transmitting data from the sensors, 
seismology is moving toward a situation where data from 
global networks, regional networks, and many local arrays 
can be easily combined, largely eliminating the distinctions 
between networks. This development offers great scientific 
opportunities. 





Fig. 6.6-18 Map of regional network seismometer stations in the continental USA as of 1999. Some networks are cooperatively operated with Canadian 
and Mexican institutions. 


Fig. 6.6-19 Records from the short-period
seismometers of California regional
networks for an Oct. 17, 1990, earthquake
in South America. The data reveal distinct
reflections off the sharp 410 km and
660 km mantle discontinuities. The ability
to examine large amounts of data over a
small geographical region greatly increases
the resolution of earth structure. (Benz and
Vidale, 1993. Reproduced with permission
from Nature.)

Fig. 6.6-20 Seismometer locations for the proposed USArray. Left: Solid triangles would be new permanent seismometers to augment the existing US
National Seismic Network (open triangles). Right: Possible locations of 2000 sites that the moving array of 400 broadband seismometers would
eventually cover. (Courtesy of P. Shearer.)


Further reading 

Because of its widespread use, an excellent literature is available both 
for signal processing in general and for geophysical applications. These include
introductory texts by Rabiner and Rader (1972), Claerbout (1976),
Bracewell (1978), Robinson and Treitel (1980), Kanasewich (1981), and 
Hatton et al. (1986). Brigham (1974) discusses the FFT in detail. 


Error analysis in the physical sciences is the subject of many books, 
including Bevington and Robinson (1992). Seismological texts, especially 
Aki and Richards (1980) and Lay and Wallace (1995), discuss seismological
instrumentation. Scherbaum (1996) addresses seismometry, especially
digital, from a signal processing viewpoint. 


Problems

1. Find analytically the coefficients of the Fourier series for the
functions

(a) A step:

$f(t) = 1$ for $0 < t < 1/2$,

$f(t) = -1$ for $-1/2 < t < 0$.

(b) A ramp: $f(t) = t$ for $-1/2 < t < 1/2$.

2. Use the formulae for the product of sine and cosine functions
(Section A.2) to prove the orthogonality relations for the sine and
cosine functions (Eqns 6.2.2-4).

3. Express the following complex numbers in $a + ib$ form:

(a) $e^{i\pi}$

(b) $4e^{i\pi/2}$

(c) $e^{-i\pi/4}$

(d) $3e^{i\pi/3}$

4. In the Fourier series (Eqn 6.2.1), no $b_0$ term is given. Why?

5. Show that

(a) The Fourier transform is linear: if $F(\omega)$ and $G(\omega)$ are the
transforms of $f(t)$ and $g(t)$, then $(aF(\omega) + bG(\omega))$ is the transform
of $(af(t) + bg(t))$.

(b) The Fourier transform of a purely real time function has the
symmetry $F(-\omega) = F^*(\omega)$.

(c) The total energy in a Fourier transform is the same as that in
the corresponding time series (Parseval’s theorem):

$\int_{-\infty}^{\infty} |f(t)|^2 \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |F(\omega)|^2 \, d\omega.$



6. If $F(\omega)$ is the Fourier transform of $f(t)$, show that the following are
also transform pairs:

(a) $f(t-a)$ and $e^{-i\omega a} F(\omega)$,

(b) $F(\omega-a)$ and $e^{iat} f(t)$,

(c) $df/dt$ and $i\omega F(\omega)$.

7. For $f(t) = \sin \omega_0 t$,

(a) Find the Fourier transform.

(b) Compare it to the Fourier transform of $f(t) = \cos \omega_0 t$.

(c) Explain what operation (filter) in the frequency domain
could be used to convert the Fourier transform of $\sin \omega_0 t$ to
that of $\cos \omega_0 t$.

(d) Explain how the relation between the Fourier transforms of
$\sin \omega_0 t$ and $\cos \omega_0 t$ could be derived using the fact that one
function is a time-shifted version of the other.

8. Show that if $f(t)$ and $F(\omega)$ are a transform pair, the inverse transform
of $F(\omega)$ yields $f(t)$.

9. Use the propagation of errors relation (Eqn 6.5.18) to show how 
the uncertainty in the following functions of several variables 
depends on the variances and covariances of the variables u and v, 
where a and b are constants: 

(a) z = au + bv, 

(b) z = auv, 

(c) z = au/v, 

(d) $z = au^b$.

10. For the discrete Fourier transform and inverse discrete Fourier
transform, show that:

(a) The DFT and IDFT are linear: if $A(k)$ and $B(k)$ are the transforms
of time series $a(n)$ and $b(n)$, then $\alpha A(k) + \beta B(k)$ is the
transform of $\alpha a(n) + \beta b(n)$.

(b) The DFT of a real time series has the symmetry $F(-k) = F(N-k) = F^*(k)$.

(c) If the DFT of $f(n)$ is $F(k)$, the DFT of $f(n-j)$ is $W^{kj} F(k)$,
and the IDFT of $F(k-m)$ is $W^{-mn} f(n)$, where $W = e^{-2\pi i/N}$.

11. As derived in Eqn 4.3.10, the depth $h$ of an earthquake can be
estimated from the difference in arrival times $\delta t$ between the direct
P wave and pP, the P wave reflected from the surface, using
$\delta t = (2h \cos i)/v$, where $i$ and $v$ are the incidence angle and velocity.

(a) Express the depth as a function of the parameters $\delta t$, $v$, $i$.

(b) Find the depth for a measured time difference of 2.7 s and
assumed velocity of 6.8 km/s and incidence angle of 24°.

(c) Use the propagation of errors relation to show how the
uncertainty in depth depends on the uncertainties of the
three parameters.

(d) Use the results of (c) to find the uncertainty in depth corresponding
to uncertainties (one standard deviation) of 0.5 s
in time difference, 0.5 km/s in velocity, and 3° in incidence
angle. (Remember to convert to radians.)


Computer problems

C-1. Using the Fourier series coefficients for the step function, derived
in problem 1a, plot the first ten terms of the series and their sum.
Also plot the sum of the first 20 and 30 terms.

C-2. Write a subroutine to prepare a time series for taking the fast
Fourier transform and take it. The subroutine should call a set
of separate subroutines that extend the time series to a power of 2
as required, allow for a taper of a length which you input, take
the FFT using the subroutine (COOLB) provided (Box 6C-2)
or another, and plot the amplitude spectrum. The subroutine
should have the option to list the real and imaginary parts of
the spectrum, and the amplitude and phase spectra, at each
frequency.

C-3. (a) Write a subroutine to generate values of the function $\sin(2\pi t/T)$
from $t = 0$ to $t = T_{\max}$, where the time step $\Delta t$, the period $T$,
and the total data length $T_{\max}$ are inputs.

(b) Plot this function for $\Delta t = 0.25$, $T = 5$, $T_{\max} = 20$.

(c) Use the results of C-2 to find the amplitude spectrum, with no
tapering and with 10% and 20% tapering.

(d) Do parts (b) and (c) for $\Delta t = 0.25$, $T = 8$, $T_{\max} = 50$.

(e) Do parts (b) and (c) for the function

$\sin(2\pi t/\ldots) + (0.5) \sin(2\pi t/\ldots)$

with $\Delta t = 0.25$, $T_{\max} = 256$.


Box 6C-2 COOLB subroutine. 1

      SUBROUTINE COOLB(NN,DATAI,SIGNI)
C     CLASSIC - BUT USABLE - FFT PROGRAM
C     DATAI IS DATA ARRAY, 2*NP REAL NUMBERS REPRESENTING
C     NP COMPLEX POINTS, SO EACH PAIR OF POINTS ARE THE
C     (REAL, IMAGINARY) PARTS OF A COMPLEX NUMBER.
C     NN IS POWER OF TWO, CAN BE FOUND BY
C     NN=(ALOG10(FLOAT(NP))/ALOG10(2.))+.99
C     TRANSFORM DIRECTION CONTROLLED BY REAL VARIABLE
C     SIGNI (SIGN OF EXPONENTIAL): -1. FORWARD, 1. TO
C     INVERT.
C     DIMENSIONS: IF TIME SERIES HAS TIME INCREMENT DT,
C     TRANSFORM HAS DELTA FREQ=1/(2**NN*DT)
C     NOTE: AFTER TAKING INVERSE FFT DIVIDE OUTPUT BY 2**NN
      INTEGER NN
      REAL SIGNI
      DIMENSION DATAI(1)
      N=2**(NN+1)
      J=1
      DO 5 I=1,N,2
      IF(I-J)1,2,2
    1 TEMPR=DATAI(J)
      TEMPI=DATAI(J+1)
      DATAI(J)=DATAI(I)
      DATAI(J+1)=DATAI(I+1)
      DATAI(I)=TEMPR
      DATAI(I+1)=TEMPI
    2 M=N/2
    3 IF(J-M)5,5,4
    4 J=J-M
      M=M/2
      IF(M-2)5,3,3
    5 J=J+M
      MMAX=2
    6 IF(MMAX-N)7,10,10
    7 ISTEP=2*MMAX
      THETA=SIGNI*6.2831853/FLOAT(MMAX)
      SINTH=SIN(THETA/2.)
      WSTPR=-2.0*SINTH*SINTH
      WSTPI=SIN(THETA)
      WR=1.
      WI=0.
      DO 9 M=1,MMAX,2
      DO 8 I=M,N,ISTEP
      J=I+MMAX
      TEMPR=WR*DATAI(J)-WI*DATAI(J+1)
      TEMPI=WR*DATAI(J+1)+WI*DATAI(J)
      DATAI(J)=DATAI(I)-TEMPR
      DATAI(J+1)=DATAI(I+1)-TEMPI
      DATAI(I)=DATAI(I)+TEMPR
    8 DATAI(I+1)=DATAI(I+1)+TEMPI
      TEMPR=WR
      WR=WR*WSTPR-WI*WSTPI+WR
    9 WI=WI*WSTPR+TEMPR*WSTPI+WI
      MMAX=ISTEP
      GO TO 6
   10 RETURN
      END


1 COOLB, written in 1960s vintage Fortran (note the arithmetic IF statements), has been left in original form to illustrate both the persistence of programs that 
work and the advantages of subsequent developments in programming practice and documentation (Section A.8.2). 
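
For comparison, the same radix-2 algorithm can be sketched in a more recent language. The Python version below is not a translation of COOLB line by line: it is a minimal illustration that assumes the input is a list of complex numbers whose length is a power of two, and it follows the COOLB sign convention (-1 for the forward transform, +1 for the unnormalized inverse). The function names `fft_radix2` and `dft_direct` are hypothetical.

```python
import cmath


def fft_radix2(data, sign=-1.0):
    """Radix-2 FFT of a list of complex numbers (length a power of two).

    sign = -1 gives the forward transform; sign = +1 gives the inverse
    without normalization, so its output must be divided by len(data).
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(x) for x in data]

    # Bit-reversal reordering of the input.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Butterfly passes: combine transforms of length size/2 into length size.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(sign * 2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1.0 + 0.0j
            for k in range(half):
                t = w * a[start + k + half]
                a[start + k + half] = a[start + k] - t
                a[start + k] = a[start + k] + t
                w *= w_step
        size *= 2
    return a


def dft_direct(data, sign=-1.0):
    """Slow direct sum of the DFT, used only to check the FFT."""
    n = len(data)
    return [sum(x * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for m, x in enumerate(data)) for k in range(n)]


series = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, 0.5]
spectrum = fft_radix2(series)
recovered = [x / len(series) for x in fft_radix2(spectrum, sign=1.0)]
max_error = max(abs(a - b) for a, b in zip(spectrum, dft_direct(series)))
print("largest difference from direct DFT:", max_error)
```

Comparing the result with the direct sum, and checking that the inverse recovers the input after division by the number of points, are simple tests that also apply to COOLB itself.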
C-4. (a) Write a subroutine, using the results of C-2, to use the fast
Fourier transform to take a time series, filter it in the frequency
domain over a specified passband, and invert the FFT,
yielding a filtered time series. The subroutine should have the
capability to taper in the frequency domain. This subroutine
is best written as a set of subroutines.

(b) Use this routine to filter the time series in C-3e to isolate the
two different frequency components.


C-5. (a) Write a subroutine, using the results of C-2 and C-4, to use
the fast Fourier transform to convolve two time series.

(b) Use it on two boxcar functions of unit amplitude, one 6 s long
and one 3 s long.

C-6. (a) Write a subroutine to do time domain convolution of two
functions of different lengths, both sampled at a time step $\Delta t$.
(b) Use it on two boxcar functions of unit amplitude, one 6 s long
and one 3 s long. Compare the results to those of C-5b.




7 Inverse Problems


Most people, if you describe a train of events to them, will tell you what the result would be. There are few people, however, who, 
if you told them a result, would be able to evolve from their own inner consciousness what the steps were which led up to that result. 
This power is what I mean when I talk of reasoning backwards. 


7.1 Introduction 

Throughout this book we have noted that seismology is largely 
directed at solving inverse problems dealing with earthquake 
sources and earth structure. We start with the end result, 
seismograms, and work backwards to characterize the earthquakes
that generated the seismic waves and the medium
through which the waves passed. To do this, we first addressed 
the forward problems of how features of seismic waves that are 
observable from seismograms, such as travel times, amplitudes, 
waveforms, eigenfrequencies, dispersion, and attenuation, 
depend on the seismic source and the medium. We have also 
discussed how the properties of the medium and the source, 
such as velocity structure and earthquake mechanisms, reflect 
tectonic processes within the earth. These are specific examples 
of the fundamental question of what we can say about the earth 
from seismological and other observations at its surface. 

We now end our discussions by addressing some issues in 
solving inverse problems. Inverse problems can be posed by 
assuming that we understand the physics of a process which, 
for a set of model parameters described by a vector m, gives rise 
to a set of observed data described by the vector d. The data 
can thus be considered the result of a function, or operator, A 
acting on the model parameters, 

d = A(m). (1) 

The forward problem, predicting the data d that would result 
from a given model described by m, is tractable if we understand
the process. The corresponding inverse problem, finding
what gave rise to a specific set of observed data, is more difficult.
We assume that some physical model describes the process,
and then use the data to estimate a set of model parameters
that are consistent with the data. We solve the inverse problem
using either mathematical inverse techniques to find m directly 
from d, or trial-and-error techniques that solve the forward 
problem repeatedly and look for the best solution. Each 
approach has advantages in some applications. 
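
The trial-and-error approach can be sketched with a deliberately simple forward problem. The example below assumes a model with two parameters, an origin time and a uniform velocity, and arrival times that equal the origin time plus distance divided by velocity. The grid limits, distances, and function names are illustrative assumptions; the synthetic data are noise-free.

```python
def forward(model, distances_km):
    """Forward problem d = A(m): arrival time = origin time + distance / velocity."""
    origin_time, velocity = model
    return [origin_time + x / velocity for x in distances_km]


def misfit(predicted, observed):
    """Sum of squared differences between predicted and observed data."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def grid_search(observed, distances_km, time_values, velocity_values):
    """Solve the forward problem for every trial model and keep the best one."""
    best_model, best_misfit = None, float("inf")
    for origin_time in time_values:
        for velocity in velocity_values:
            trial = (origin_time, velocity)
            value = misfit(forward(trial, distances_km), observed)
            if value < best_misfit:
                best_model, best_misfit = trial, value
    return best_model, best_misfit


distances_km = [10.0, 25.0, 40.0, 60.0, 90.0]
true_model = (2.0, 6.0)                              # (s, km/s)
observed = forward(true_model, distances_km)         # synthetic "data"
time_values = [0.1 * i for i in range(51)]           # 0 to 5 s
velocity_values = [4.0 + 0.1 * i for i in range(41)]  # 4 to 8 km/s
best_model, best_misfit = grid_search(observed, distances_km,
                                      time_values, velocity_values)
print("best-fitting model:", best_model)
```

The search needs only the forward problem, which makes it easy to apply to nonlinear problems, but its cost grows rapidly with the number of parameters and its answer is no finer than the grid.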

We have already mentioned solving inverse problems in 
contexts including studying the cooling of oceanic lithosphere 
using surface wave dispersion (Section 2.8.3), inverting travel 
time and amplitude data to find earth structure (Chapter 3), 
inverting polarity, waveform, and geodetic data to study 
earthquake mechanisms (Chapter 4), and using earthquake 
mechanisms to study plate motions and regional tectonics 
(Chapter 5). We have noted (Section 1.1.2) that although forward
problems typically can be solved in a straightforward
way, giving a unique solution, inverse problems often have
no unique, exact, or “correct” solutions. Because the data are 
generally somewhat inconsistent due to errors, and our models 
simplify complex reality, no model exactly describes the data. 
Similarly, a range of parameters can describe the data equally 
well for a given model, and we have various models to choose 
from based on various criteria and preconceptions. Moreover, 
the data are often insufficient to resolve aspects of the model. 
We can thus only recognize and accept these limitations on the 
solutions. 1 

A consequence of these limitations is a trade-off between
the model’s resolution, how detailed it is, and its stability, or
robustness. For example, inverting travel times with simple
earthquake location algorithms using a laterally homogeneous
velocity model shows the Wadati-Benioff zones of dipping
seismicity. These results are stable, in that they do not depend
significantly on the details of the location algorithm and velocity
model, but have only limited resolution for where in the
slab the earthquakes occur. More detailed locations, which
are more useful for relating the earthquakes to the physics
of subduction, can be derived from sophisticated location
algorithms using a laterally variable velocity model that better
represents the slab. However, the improved resolution comes
at the price of stability, in that it depends on the specific velocity
model used.

1 This situation is summarized by the title of a paper “Interpretation of inaccurate,
insufficient, and inconsistent data” (Jackson, 1972).

Table 7.1-1 Some large-scale reference models.

| Model for | Observables inverted and predicted | Parameters estimated | Misfits ("anomalies") indicate |
|---|---|---|---|
| Laterally homogeneous earth structure | Travel times, eigenfrequencies | Average velocity and density versus depth | Lateral velocity variation (subduction zones, continental-ocean differences, etc.) |
| Relative plate motions | Rates and azimuths of plate motion | Euler vectors | Nonrigid plate behavior (plate interiors and boundary zones) |
| Thermal evolution of oceanic lithosphere | Variation with age in depth, heat flow, and geoid | Plate thickness, asthenospheric temperature, physical properties (e.g., α, k, κ) | Lateral thermomechanical variations (swells, etc.) |

The results of inverse studies can be viewed in terms of 
two end members. In one, we use an individual set of data to 
characterize a specific phenomenon, such as the location of an 
earthquake or the velocity structure in a specific area. In the other,
we describe a set of data averaged over a region or the whole 
earth with a simple physical model characterized by a relatively 
small, or sparse , set of parameters. Such reference models — the 
physical model with a specific set of parameters — are used 
to characterize large sets of data in a simple way, predict data 
where no observations exist, and thus identify misfits, or 
“anomalies,” where the data deviate from the model predictions 
and hence the global average. We then use reference models 
to draw inferences about the processes that give rise to both the 
average situation and deviations from it. For example, body 
wave, surface wave, and normal mode data give average global 
velocity structure. This structure is used to constrain models of 
the average radial variations in composition and temperature, 
and as a reference against which velocity perturbations due to 
subducting slabs, continental roots, hot spots, ridges, etc. can 
be identified and analyzed in terms of local processes that perturb
the global model. As shown in Table 7.1-1, we can view
other reference models in a similar way. For example, the Euler 
vectors describing a plate’s motion are a simple description 
of its behavior, and places where earthquake mechanisms 
differ from these predictions indicate deviations from rigid 
plate behavior. Similarly, simple cooling models of the oceanic 
lithosphere describe the average variations in depth, heat flow, 
and the geoid, and so give a reference model for the temperature
against which other effects can be identified and modeled.

As illustrated in Fig. 1.1-8, the models are refined over time
using new data and model parameterizations. Eventually,
the reference model does not improve significantly. When this
occurs, we are probably doing about as well as possible with
this type of model. For example, as discussed in Section 3.5,
laterally homogeneous global seismic velocity models have
become sufficiently accurate that more attention is now directed
toward the lateral variations.

In this chapter, we discuss several inverse problems to introduce
some of the methods used. Because such inverse problems
are crucial to seismology and the earth sciences, and also
appear in other sciences, considerable attention has been directed
toward them. It turns out that physically quite different
problems are often described in mathematically similar ways.
Our goal is to identify some common themes and approaches,
rather than discuss the details. Some more sophisticated treatments
are listed in the suggested reading.

7.2 Earthquake location 

We first consider the classic inverse problem of locating an 
earthquake and finding its origin time using the arrival times of 
seismic waves at various stations. The velocity structure, which 
determines the ray paths and hence travel times, is crucial. We 
first regard the velocity structure as known, and then explore 
how it can also be estimated from the travel times. 

7.2.1 Theory 

Assume that an earthquake occurred at an unknown time $t$, at
an unknown position $\mathbf{x} = (x, y, z)$, known as the hypocenter,
or focus (Fig. 7.2-1). The point $(x, y)$ on the surface above the
focus is called the epicenter. A set of $n$ seismic stations at locations
$\mathbf{x}_i = (x_i, y_i, z_i)$ detect the earthquake at arrival times $d'_i$, which
depend on the origin time $t$ and the travel time between the
source and the station $T(\mathbf{x}, \mathbf{x}_i)$:

$d'_i = T(\mathbf{x}, \mathbf{x}_i) + t. \qquad (1)$

If the velocity structure is known, the forward problem can be
written using the formulation

$\mathbf{d} = A(\mathbf{m})$, or $d_i = A_i(m_j)$, $\qquad (2)$

showing how the data vector, containing the arrival times at
the stations, can be computed from an assumed model vector
composed of the source location and origin time,

$\mathbf{m} = (x, y, z, t). \qquad (3)$

The model vector consists of physically different quantities:
three space coordinates and an origin time. Because the data
and model are vectors, relations between them can be written
in terms of either vectors ($\mathbf{d} = A(\mathbf{m})$) or their components
($d_i = A_i(m_j)$).

Fig. 7.2-1 Geometry for earthquake location in a homogeneous (uniform
velocity) halfspace. Labeled points: the epicenter, station $i$ at $(x_i, y_i, 0)$,
and the earthquake focus at $(x, y, z)$.
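
For the homogeneous halfspace of Fig. 7.2-1 the forward problem is simple enough to write out. The sketch below assumes straight rays, a uniform velocity, depth measured as the $z$ coordinate, and stations at $z = 0$; the station coordinates, velocity, and function names are illustrative and not taken from the text.

```python
import math


def travel_time(source_xyz, station_xyz, velocity_km_s):
    """Straight-ray travel time T(x, x_i) in a uniform-velocity halfspace."""
    distance = math.sqrt(sum((s - r) ** 2
                             for s, r in zip(source_xyz, station_xyz)))
    return distance / velocity_km_s


def predict_arrivals(model, stations, velocity_km_s):
    """Forward problem d_i = T(x, x_i) + t for the model m = (x, y, z, t)."""
    x, y, z, origin_time = model
    return [travel_time((x, y, z), station, velocity_km_s) + origin_time
            for station in stations]


# Three surface stations (km) and a source 10 km below the first one.
stations = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (0.0, 40.0, 0.0)]
model = (0.0, 0.0, 10.0, 5.0)   # x, y, z in km; origin time in s
arrivals = predict_arrivals(model, stations, 5.0)
for station, arrival in zip(stations, arrivals):
    print(station, round(arrival, 3))
```

The station directly above the source records the earliest arrival, 2 s after the origin time for the assumed 10 km depth and 5 km/s velocity.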

The inverse problem can be stated as: given the observed
arrival times, find a model that fits them. To do this, we begin
with a starting model $\mathbf{m}^0$, which is an estimate of (or guess at)
a model that we hope is close to the solution we seek. The
starting model predicts that we would have observed data
$\mathbf{d}^0 = A(\mathbf{m}^0)$. Unless we are lucky, these predicted data are not
what were actually observed. Hence we seek changes $\Delta m_j$ in the
starting model

$m_j = m_j^0 + \Delta m_j \qquad (4)$

that will make the predicted data closer to those observed. In
general, the data do not depend linearly on the model parameters,
so we linearize the problem by expanding the data in a
Taylor series about the starting model $\mathbf{m}^0$ and keeping only the
linear term,

$d_i \approx d_i^0 + \sum_j \left. \frac{\partial d_i}{\partial m_j} \right|_{\mathbf{m}^0} \Delta m_j. \qquad (5)$

This equation can be written in terms of the difference between
the observed data and those predicted,

$\Delta d_i = d'_i - d_i^0 \approx \sum_j \left. \frac{\partial d_i}{\partial m_j} \right|_{\mathbf{m}^0} \Delta m_j. \qquad (6)$

Such relations are common in inverse problems. For simplicity,
we omit the superscripts and define the partial derivative
matrix as

$G_{ij} = \frac{\partial d_i}{\partial m_j}, \qquad (7)$

so the equation becomes

$\Delta \mathbf{d} = \mathbf{G} \Delta \mathbf{m}$, or $\Delta d_i = \sum_j G_{ij} \Delta m_j. \qquad (8)$

Often the $\Delta$s are also suppressed, and the equation is written as
$\mathbf{d} = \mathbf{G}\mathbf{m}$. This makes the notation simpler, but can be confusing
at first. In this derivation, we retain the $\Delta$s to explicitly indicate
changes.
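
As an illustration of Eqn 7, the partial derivatives can be written in closed form for the uniform-velocity halfspace. The sketch below assumes straight rays, so that the arrival time is $R_i/v + t$ with $R_i$ the source-station distance; then the derivatives with respect to $x$, $y$, and $z$ are $(x - x_i)/(vR_i)$ and so on, and the derivative with respect to the origin time is 1. The numbers and function names are hypothetical.

```python
import math


def arrival(model, station, velocity):
    """Predicted arrival time R / v + t for a straight ray."""
    x, y, z, origin_time = model
    xi, yi, zi = station
    r = math.sqrt((x - xi) ** 2 + (y - yi) ** 2 + (z - zi) ** 2)
    return r / velocity + origin_time


def partials_matrix(model, stations, velocity):
    """G[i][j] = partial of arrival i with respect to parameter j of (x, y, z, t).

    Assumes the source does not coincide with any station (R > 0).
    """
    x, y, z, _ = model
    rows = []
    for xi, yi, zi in stations:
        r = math.sqrt((x - xi) ** 2 + (y - yi) ** 2 + (z - zi) ** 2)
        rows.append([(x - xi) / (velocity * r),
                     (y - yi) / (velocity * r),
                     (z - zi) / (velocity * r),
                     1.0])
    return rows


velocity = 6.0   # km/s, assumed uniform
stations = [(10.0, -5.0, 0.0), (50.0, 10.0, 0.0), (-40.0, 30.0, 0.0),
            (-20.0, -45.0, 0.0), (35.0, -40.0, 0.0)]
model = (5.0, 2.0, 12.0, 1.0)
G = partials_matrix(model, stations, velocity)
print(len(G), "rows (stations) by", len(G[0]), "columns (model parameters)")
```

Each row of the matrix corresponds to one station and each column to one model parameter. A useful check on any such code is to compare the analytic derivatives with finite differences of the forward problem.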

Equation 8 is a vector-matrix equation representing a system
of simultaneous linear equations. To solve it, we seek a
change in the model $\Delta \mathbf{m}$ that, when multiplied by the known
partial derivative matrix $\mathbf{G}$, gives the required change in the
data $\Delta \mathbf{d}$. This is an inverse problem, in contrast to the forward
problem of finding the change in the data $\Delta \mathbf{d}$ predicted by an
assumed change $\Delta \mathbf{m}$ in the model. Many aspects of inverse
theory deal with solving such equations under various circumstances.
The earthquake location problem considered here
is a simple case.

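A common complexity is that we generally have arrival time
observations at many (often several hundred) seismic stations,
and are solving for only four model parameters. In the notation
of Eqn 8, $j$ ranges from 1 to 4, and $i$ ranges from 1 to $n$, where $n$
is much greater than 4. Because each arrival time corresponds
to one equation, and each model parameter provides one
unknown, $\mathbf{G}$ has a number of rows equal to the number of
arrival time observations, and a number of columns equal to
the number of model parameters. Because there are more ($n$)
equations than unknowns (4), $\mathbf{G}$ has more rows than columns,
so Eqn 8 looks like

$\begin{pmatrix} \Delta d_1 \\ \Delta d_2 \\ \vdots \\ \Delta d_n \end{pmatrix} = \begin{pmatrix} G_{11} & G_{12} & G_{13} & G_{14} \\ G_{21} & G_{22} & G_{23} & G_{24} \\ \vdots & \vdots & \vdots & \vdots \\ G_{n1} & G_{n2} & G_{n3} & G_{n4} \end{pmatrix} \begin{pmatrix} \Delta m_1 \\ \Delta m_2 \\ \Delta m_3 \\ \Delta m_4 \end{pmatrix}. \qquad (9)$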

Such overdetermined problems can pose difficulties. One way to
see this is to recall that if $n$ were equal to 4 the matrix $\mathbf{G}$ would
be square (have the same number of rows and columns), so
Eqn 8 could be solved by multiplication by the inverse matrix,

$\mathbf{G}^{-1} \Delta \mathbf{d} = \mathbf{G}^{-1} \mathbf{G} \Delta \mathbf{m} = \Delta \mathbf{m}$, or

$\sum_i G^{-1}_{ki} \Delta d_i = \sum_i G^{-1}_{ki} \sum_j G_{ij} \Delta m_j = \Delta m_k. \qquad (10)$

If the number of arrival time observations exceeds four,
this method cannot be used, because $\mathbf{G}$ is not square and thus
does not have an inverse. 1 Our first instinct might be to use only
arrival times at four stations, which would give an exact solution,
and assume that the arrival times at the other stations
give only extra, redundant information. In an ideal world this
would be the case. In reality, the arrival times contain errors
due to a variety of possible effects, including reading errors,
inaccuracies in the clocks at the stations, and misidentification
of the first arrivals. In addition to these errors of measurement,
there are systematic errors due to the fact that the velocity
structure is not perfectly known and is laterally variable. As a
result, the equations are inconsistent: no one model can solve
them exactly. Moreover, choosing four arrival times might
mean selecting data poorer than those discarded. The approach
taken instead is to seek the origin time and source location that
“best” solve the overdetermined, inconsistent equations.

To do this, we regard the observations $d'_i$ as having errors
described by their standard deviations $\sigma_i$ and find the model
that minimizes the misfit,

$$\chi^2 = \sum_i \frac{1}{\sigma_i^2}\left(\Delta d_i - \sum_j G_{ij}\,\Delta m_j\right)^2, \tag{11}$$

which is the prediction error, the normalized sum of the
squares of the difference between the observed arrival times
and those predicted by the model. $\chi^2$, the fitting function to be
minimized, weights the data by the reciprocal of their variances
so that the most uncertain have the least effect. To find the best
fit, we set partial derivatives of the misfit with respect to the
change in model parameters $\Delta m_k$ equal to zero, and use the
fact that the model elements are independent, so the partial
derivative of the change in one with respect to those in the
others is zero,


$$\frac{\partial\,\Delta m_j}{\partial\,\Delta m_k} = \delta_{jk}. \tag{12}$$

The partial derivatives of the misfit are

$$\frac{\partial\chi^2}{\partial\,\Delta m_k} = -2\sum_i \frac{1}{\sigma_i^2}\left(\Delta d_i - \sum_j G_{ij}\,\Delta m_j\right)G_{ik} = 0, \tag{13}$$

or

$$\sum_i \frac{1}{\sigma_i^2}\,\Delta d_i\,G_{ik} = \sum_i \frac{1}{\sigma_i^2}\left(\sum_j G_{ij}\,\Delta m_j\right)G_{ik}. \tag{14}$$

If the variances of the data are equal ($\sigma_i^2 = \sigma^2$), that term can be
factored out, and

$$\sum_i \Delta d_i\,G_{ik} = \sum_i \left(\sum_j G_{ij}\,\Delta m_j\right)G_{ik}, \tag{15}$$

or, in matrix notation,

$$G^T\Delta\mathbf{d} = G^TG\,\Delta\mathbf{m}. \tag{16}$$

To see that $\sum_i \Delta d_i\,G_{ik} = G^T\Delta\mathbf{d}$, whereas
$\sum_j G_{ij}\,\Delta m_j = G\,\Delta\mathbf{m}$, consider the dimensions.

The advantage of this form is that although the matrix $G$
cannot be inverted, the matrix $G^TG$ is square and can be
inverted. Equation 16 thus gives $\Delta\mathbf{m}$, the standard least squares
solution to a set of equations that cannot be solved exactly,
because

$$\Delta\mathbf{m} = (G^TG)^{-1}G^T\Delta\mathbf{d} = G^{-g}\Delta\mathbf{d}, \quad\text{or}\quad
\Delta m_j = \sum_i G^{-g}_{ji}\,\Delta d_i. \tag{17}$$

The operator $(G^TG)^{-1}G^T$, which acts on the data to yield the
model, is called the generalized inverse of $G$, and is written
as $G^{-g}$. It provides the “best” solution in a least squares sense,
because it gives the smallest squared misfit. The generalized
inverse is the analog of the inverse, but for a matrix that is not
square, and hence does not have a conventional inverse. If $G$ is
square and has an inverse, then $G^{-1} = G^{-g}$. If the data errors are
not equal, the least squares solution is weighted by the errors,
as shown in problem 5 at the end of this chapter.

¹ The definition of the inverse (Section A.4.3) requires that both pre- and
postmultiplication yield the identity; i.e., $A^{-1}A = AA^{-1} = I$.

To use this method, we begin with a starting model (source
location and origin time) $\mathbf{m}^0$ and predict the values expected
for the data, $\mathbf{d}^0 = A(\mathbf{m}^0)$. We then form the residual vector
giving the misfit to the data, $\Delta\mathbf{d}^0 = \mathbf{d}' - \mathbf{d}^0$, evaluate the matrix of
partial derivatives about the starting model,

$$G_{ij} = \left.\frac{\partial d_i}{\partial m_j}\right|_{\mathbf{m}^0}, \tag{18}$$

and use the generalized inverse (Eqn 17) to find $\Delta\mathbf{m}^0$, the
change in the starting model that gives a better fit to the data.
Thus the new model

$$\mathbf{m}^1 = \mathbf{m}^0 + \Delta\mathbf{m}^0 \tag{19}$$

predicts values of the data

$$\mathbf{d}^1 = A(\mathbf{m}^1) \tag{20}$$

that should be closer to the observations than the predictions
of the starting model. This can be tested by computing the
difference between the observations and the predicted data for the
new model, $\Delta\mathbf{d}^1 = \mathbf{d}' - \mathbf{d}^1$, and examining the total squared misfit
$\sum_i (\Delta d_i^1)^2 = \sum_i (d'_i - d_i^1)^2$. This should be less than the
corresponding misfit for the starting model, $\sum_i (\Delta d_i^0)^2$. The total squared
misfit is more useful than the total misfit $\sum_i \Delta d_i$, because the
latter could be small for large misfits of opposite signs. 
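
The least squares solution of Eqn 17 can be sketched numerically. The example below is only an illustration under stated assumptions: the matrix `G`, the "true" model change `dm_true`, and the noise level are hypothetical random values, not the location problem itself. It forms the generalized inverse $(G^TG)^{-1}G^T$ explicitly, checks that the result satisfies the normal equations (Eqn 16), and compares it with NumPy's library least squares routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical overdetermined system: n = 10 equations, 4 unknowns.
n, k = 10, 4
G = rng.normal(size=(n, k))
dm_true = np.array([1.0, -2.0, 0.5, 3.0])

# Added noise makes the ten equations inconsistent, so no model fits exactly.
dd = G @ dm_true + 0.05 * rng.normal(size=n)

# Generalized inverse G^{-g} = (G^T G)^{-1} G^T (Eqn 17).
G_g = np.linalg.inv(G.T @ G) @ G.T
dm_est = G_g @ dd

# The same least squares solution from a library routine.
dm_lstsq, *_ = np.linalg.lstsq(G, dd, rcond=None)

# Residual of the best-fitting model; G^T times it vanishes (Eqn 16).
residual = dd - G @ dm_est

print("estimated model change:", np.round(dm_est, 3))
print("total squared misfit:  ", float(residual @ residual))
```

Forming $(G^TG)^{-1}$ explicitly mirrors the derivation; in practice a routine such as `numpy.linalg.lstsq` is usually preferred because it avoids squaring the condition number of $G$.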

We can often do even better. Remember that the $G$ matrix of
partial derivatives was found by expanding the function that
predicts the data (travel times) about the starting model in a
Taylor series, and taking the linear terms. This expansion
works well if the starting model is “close” to the actual model.
If this is not the case, the linear approximation may not be a
good one. Figure 7.2-2 illustrates this idea schematically. The
actual situation is hard to draw, because each model vector is
an element in a four-dimensional (three space and one time)
vector space.

Fig. 7.2-2 Schematic illustration of the effect of linearizing about a
starting model in an inverse problem. The new model is found from the
difference between the observed data and that predicted for the starting
model. The worse the linear approximation is, the more iterations will be
needed to reach the true model.

As a result, the method can be iterated. Once the model has
been changed, a new partial derivative matrix

$$G_{ij} = \left.\frac{\partial d_i}{\partial m_j}\right|_{\mathbf{m}^1} \tag{21}$$

is found by expanding the function that predicts the data about
the new model. The generalized inverse method is then used to
solve

$$\Delta\mathbf{d}^1 = G\,\Delta\mathbf{m}^1 \tag{22}$$

for a further change in the model $\Delta\mathbf{m}^1$ that reduces the
remaining misfit. This process is repeated until successive iterations
produce only small changes in the model, and hence in the total
misfit to the data (Fig. 7.2-3).

Fig. 7.2-3 Schematic illustration of the variation in misfit to the data as a
function of iteration number for an inverse problem.


7.2.2 Earthquake location for a homogeneous medium

To make these ideas less abstract, we consider the simple case
of locating an earthquake in a medium of uniform velocity $v$. In
this case the ray paths connecting an earthquake and seismic
stations are straight lines. This geometry approximates a
situation where the receivers are close enough to a source that the
first arrivals are direct waves in a medium whose velocity does
not vary significantly. Seismic waves from an earthquake that
occurred at time $t$ at location $\mathbf{x} = (x, y, z)$ are recorded by
seismic stations at positions $\mathbf{x}_i = (x_i, y_i, z_i)$ with arrival times

$$d_i = T(\mathbf{x}, \mathbf{x}_i) + t = \frac{1}{v}\left[(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2\right]^{1/2} + t. \tag{23}$$

Although the earthquake can occur below the surface, the
stations are at the surface $z_i = 0$. The travel times depend only
on the distance between source and receiver, $|\mathbf{x} - \mathbf{x}_i|$.

To solve the inverse problem, we form the matrix $G_{ij}$. Its
elements, the partial derivatives of the elements of the data vector
$d_i$ (the arrival times at each station) with respect to the model
parameters $m_j$ (the location coordinates and origin time of the
earthquake), are easily found. Differentiating the $i$th element
of the data vector with respect to the first element of the
model vector, which is the $x$ coordinate of the location, gives

$$G_{i1} = \frac{\partial d_i}{\partial m_1} = \frac{\partial d_i}{\partial x} = \frac{\partial T(\mathbf{x}, \mathbf{x}_i)}{\partial x}
= \frac{(x - x_i)}{v}\left[(x - x_i)^2 + (y - y_i)^2 + z^2\right]^{-1/2}. \tag{24}$$

Similar expressions give the partial derivatives with respect to
the other two space coordinates of the location. Note that these
partial derivatives are functions of the spatial model
parameters $(x, y, z)$. The final partial derivative, with respect to
origin time, is just

$$G_{i4} = \frac{\partial d_i}{\partial m_4} = \frac{\partial d_i}{\partial t} = 1. \tag{25}$$

Given the $G$ matrix, the earthquake is located by choosing a
starting model, forming the difference $\Delta\mathbf{d}$ between the model
predictions and the observations, and solving for the change in
the model $\Delta\mathbf{m}$ using the procedure in the last section.

Table 7.2-1 (top) illustrates a hypothetical example of locating
an earthquake with ten stations located within a 100 km
square. The earthquake is assumed to have occurred at time 0
seconds at the point (0, 0, 10) km. We then try to locate the
earthquake using the computed arrival times at the ten stations
as “data.” For a starting model, we assume the earthquake
occurred at time 2 seconds at (3, 4, 20) km. As discussed in the
previous section, we compute the arrival times expected at each
station for a source located at the initial estimated position and
time, and then form the residual, the difference between the
“data” and this prediction (Eqn 6). For the starting model, the
total squared misfit is 92.4 s².


Table 7.2-1 Earthquake location example with error-free data.

Invert for location and origin time

model evolution
                                   model for iteration number
parameter          actual value        0        1        2
x (km)                  0.0           3.0     -0.5      0.0
y (km)                  0.0           4.0     -0.6      0.0
z (km)                 10.0          20.0     10.1     10.0
origin time (s)         0.0           2.0      0.2      0.0

station location (km)            residual (s) for iteration number
    x         y                        0        1        2
  35.0       9.0                     -2.1     -0.4      0.0
 -44.0      10.0                     -3.0     -0.2      0.0
 -11.0     -25.0                     -3.8     -0.1      0.0
  23.0     -39.0                     -3.0     -0.2      0.0
  42.0     -27.0                     -2.6     -0.3      0.0
 -12.0      50.0                     -2.0     -0.3      0.0
 -45.0      16.0                     -2.9     -0.2      0.0
   5.0     -19.0                     -3.7     -0.2      0.0
  -1.0     -11.0                     -4.1     -0.2      0.0
  20.0      11.0                     -2.4     -0.4      0.0
error (s²)                           92.4      0.6      0.0

Invert for location, origin time, and velocity

model evolution
                                   model for iteration number
parameter          actual value        0        1        2
x (km)                  0.0           3.0      0.2      0.0
y (km)                  0.0           4.0      0.3      0.0
z (km)                 10.0          20.0     10.2     10.0
origin time (s)         0.0           2.0      0.7      0.0
velocity (km/s)         5.0           4.0      4.9      5.0

station location (km)            residual (s) for iteration number
    x         y                        0        1        2
  35.0       9.0                     -4.0     -0.9      0.0
 -44.0      10.0                     -5.6     -1.0      0.0
 -11.0     -25.0                     -5.7     -0.9      0.0
  23.0     -39.0                     -5.6     -1.0      0.0
  42.0     -27.0                     -5.2     -1.0      0.0
 -12.0      50.0                     -4.6     -0.9      0.0
 -45.0      16.0                     -5.6     -1.0      0.0
   5.0     -19.0                     -5.2     -0.9      0.0
  -1.0     -11.0                     -5.3     -0.9      0.0
  20.0      11.0                     -3.8     -0.8      0.0
error (s²)                          261.3      8.3      0.0


To reduce the misfit, we form the partial derivative matrix
$G_{ij}$ evaluated at the starting model, and use the generalized
inverse (Eqn 17) to solve for $\Delta\mathbf{m}^0$, the change in the starting
model that would best fit the residuals. This change gives a
source location of (-0.5, -0.6, 10.1) km and an origin time of
0.2 s. This new estimate is close to the true values. Because for a
real case the true model would not be known, the new model
is tested by calculating the expected arrival times, forming the
residuals, and examining the total squared misfit, which is
reduced to 0.6 s². To reduce this further, we form the partial
derivative matrix evaluated at the new model and iterate again.
The resulting change in the model yields the true model exactly, 
which fits the data perfectly. 
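
The iteration just described can be sketched in a few lines. This is a minimal illustration, not the program used to produce Table 7.2-1: the station coordinates are taken from the table, the velocity of 5 km/s is an assumption (it matches the "actual value" listed in the lower part of the table), and the function names `predict`, `partials`, and `locate` are hypothetical.

```python
import numpy as np

# Station (x, y) coordinates in km from Table 7.2-1; stations sit at z = 0.
stations = np.array([
    [35.0, 9.0], [-44.0, 10.0], [-11.0, -25.0], [23.0, -39.0], [42.0, -27.0],
    [-12.0, 50.0], [-45.0, 16.0], [5.0, -19.0], [-1.0, -11.0], [20.0, 11.0],
])
V = 5.0  # km/s, assumed uniform velocity


def predict(m):
    """Arrival times (Eqn 23) for a model m = (x, y, z, t)."""
    x, y, z, t = m
    dist = np.sqrt((x - stations[:, 0])**2 + (y - stations[:, 1])**2 + z**2)
    return dist / V + t


def partials(m):
    """Partial derivative matrix G (Eqns 24 and 25), one row per station."""
    x, y, z, _ = m
    dx = x - stations[:, 0]
    dy = y - stations[:, 1]
    dist = np.sqrt(dx**2 + dy**2 + z**2)
    return np.column_stack(
        [dx / (V * dist), dy / (V * dist), z / (V * dist), np.ones(len(stations))]
    )


def locate(d_obs, m_start, n_iter=8):
    """Iterate m -> m + dm, with dm from the generalized inverse (Eqn 17)."""
    m = np.array(m_start, dtype=float)
    history = [(m.copy(), float(np.sum((d_obs - predict(m))**2)))]
    for _ in range(n_iter):
        G = partials(m)                          # re-linearize about current model
        dd = d_obs - predict(m)                  # residual vector
        dm = np.linalg.solve(G.T @ G, G.T @ dd)  # normal equations (Eqn 16)
        m = m + dm
        history.append((m.copy(), float(np.sum((d_obs - predict(m))**2))))
    return m, history


m_true = np.array([0.0, 0.0, 10.0, 0.0])
d_obs = predict(m_true)                          # error-free "data"
m_final, history = locate(d_obs, [3.0, 4.0, 20.0, 2.0])

for i, (m, misfit) in enumerate(history[:4]):
    print(i, np.round(m, 2), round(misfit, 2))
```

Because the partial derivatives are recomputed at every step, each pass solves a fresh linearized problem about the latest model, which is the iteration described in the previous section.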

This success is hardly surprising, because the data had no 
errors. We could thus have used any four data to find the 
model, and avoided the generalized inverse. Before turning to 
discuss the errors, note that the same procedure could be 
used to find the velocity. To do so, we regard the velocity as a 
fifth model parameter, and invert the data for a model vector 
m = (x, y, z, t, v). The additional partial derivatives are 

$$\frac{\partial d_i}{\partial m_5} = \frac{\partial d_i}{\partial v} = -\frac{1}{v^2}\left[(x - x_i)^2 + (y - y_i)^2 + z^2\right]^{1/2}. \tag{26}$$

We thus assume a velocity as part of the starting model, find the 
partial derivative matrix (which now has five columns), and 
use the generalized inverse to find the changes in the starting 
model. Table 7.2-1 ( bottom) illustrates this process for the 
same example as before, except that we also invert for velocity. 
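
Adding velocity as a fifth parameter changes only the forward function and adds one column to $G$. The sketch below assumes the same station geometry and starting model as Table 7.2-1 (bottom); the function names are hypothetical, and `numpy.linalg.lstsq` is used in place of the explicit generalized inverse because it returns the same least squares solution.

```python
import numpy as np

# Station (x, y) coordinates in km from Table 7.2-1; stations sit at z = 0.
stations = np.array([
    [35.0, 9.0], [-44.0, 10.0], [-11.0, -25.0], [23.0, -39.0], [42.0, -27.0],
    [-12.0, 50.0], [-45.0, 16.0], [5.0, -19.0], [-1.0, -11.0], [20.0, 11.0],
])


def predict(m):
    """Arrival times for a model m = (x, y, z, t, v)."""
    x, y, z, t, v = m
    dist = np.sqrt((x - stations[:, 0])**2 + (y - stations[:, 1])**2 + z**2)
    return dist / v + t


def partials(m):
    """Five-column partial derivative matrix; the last column is Eqn 26."""
    x, y, z, _, v = m
    dx = x - stations[:, 0]
    dy = y - stations[:, 1]
    dist = np.sqrt(dx**2 + dy**2 + z**2)
    return np.column_stack([
        dx / (v * dist),
        dy / (v * dist),
        z / (v * dist),
        np.ones(len(stations)),
        -dist / v**2,
    ])


m_true = np.array([0.0, 0.0, 10.0, 0.0, 5.0])
d_obs = predict(m_true)                       # error-free "data"

m = np.array([3.0, 4.0, 20.0, 2.0, 4.0])      # starting model, velocity 4 km/s
misfits = [float(np.sum((d_obs - predict(m))**2))]
for _ in range(10):
    G = partials(m)
    dm, *_ = np.linalg.lstsq(G, d_obs - predict(m), rcond=None)
    m = m + dm
    misfits.append(float(np.sum((d_obs - predict(m))**2)))

print("final model:", np.round(m, 3))
print("misfit by iteration:", [round(x, 2) for x in misfits[:4]])
```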


7.2.3 Errors 

Because earthquakes are located using arrival time data that 
have errors, the resulting locations and origin times have uncer¬ 
tainties. To assess these uncertainties, we examine how errors 
in the data affect the generalized inverse solution. 

We characterize the errors in the data at the $i$th station, $d_i$, by
viewing the specific values measured as samples from a parent
distribution that includes all possible $d_i^{(k)}$, $k = 1, \ldots, \infty$, such
that an infinite number of measurements would yield the parent
distribution. In this notation, $d_i^{(k)}$ is the $k$th sample of the
arrival time at station $i$. Because in real applications the parent
distribution for $d_i$ is unknown, it is common to assume a
Gaussian distribution with mean $\bar{d}_i$ and standard deviation $\sigma_i$,
as discussed in Section 6.5. For a large number of measurements
(samples) from this distribution, the mean is the average

$$\bar{d}_i = \lim_{K\to\infty}\frac{1}{K}\sum_{k=1}^{K} d_i^{(k)}, \tag{27}$$

and the variance is

$$\sigma_i^2 = \lim_{K\to\infty}\frac{1}{K}\sum_{k=1}^{K}\left(d_i^{(k)} - \bar{d}_i\right)^2. \tag{28}$$

If the Gaussian parent distribution is an appropriate choice,
there is a 68% probability that any sample will fall in the range
$\bar{d}_i \pm \sigma_i$, and a 95% probability that any sample will fall in the
range $\bar{d}_i \pm 2\sigma_i$ (Fig. 6.5-1).

The errors at different stations are described by the
variance-covariance matrix of the data

$$\sigma^2_{d_{ij}} = \lim_{K\to\infty}\frac{1}{K}\sum_{k=1}^{K}\left(d_i^{(k)} - \bar{d}_i\right)\left(d_j^{(k)} - \bar{d}_j\right). \tag{29}$$

The diagonal ($i = j$) terms are the variances for data at
individual stations. The off-diagonal terms ($i \neq j$) are the
covariances that describe the relation between errors at pairs
of stations. If the errors are uncorrelated between two stations
(for example, those due to a station clock), then how a
measurement at one station differs from the mean there is
unrelated to what occurs at another station, so their covariance
is ideally zero. Given a finite number of real data, we expect the
covariance to be small. By contrast, if the errors are correlated
(for example, if one person were reading seismograms from
different stations with a consistent bias), then similar
deviations from the mean occur between these stations, and their
covariances would be larger. Errors can also be anti-correlated,
such that deviations at a station tend to occur in one direction,
whereas those at another station tend the other way, yielding
negative covariances. Although errors of measurement are likely
to be uncorrelated, systematic errors are often correlated. For
example, variations in velocity can cause systematic biases that
are either correlated or anti-correlated between different stations.

The data are inverted using the generalized inverse solution

$$m_j = \sum_i G^{-g}_{ji}\,d_i \tag{30}$$

(here the $\Delta$s are not written). As a result, the uncertainty in
a model parameter can reflect errors in all of the data. Thus,
even if the errors in the data are uncorrelated, the resulting
uncertainties in model parameters can be correlated. To see
this, we write the covariances of the model parameters in terms
of those for the data

$$\sigma^2_{m_{ij}} = \lim_{K\to\infty}\frac{1}{K}\sum_{k=1}^{K}\left(m_i^{(k)} - \bar{m}_i\right)\left(m_j^{(k)} - \bar{m}_j\right)$$
$$\qquad = \lim_{K\to\infty}\frac{1}{K}\sum_{k=1}^{K}\left[\sum_p G^{-g}_{ip}\left(d_p^{(k)} - \bar{d}_p\right)\right]\left[\sum_s G^{-g}_{js}\left(d_s^{(k)} - \bar{d}_s\right)\right]
= \sum_p\sum_s G^{-g}_{ip}\,\sigma^2_{d_{ps}}\,G^{-g}_{js}. \tag{31}$$

This relation can be written in matrix form in terms of $\sigma_d^2$ and
$\sigma_m^2$, the variance-covariance matrices for the data and model:

$$\sigma_m^2 = G^{-g}\,\sigma_d^2\,(G^{-g})^T. \tag{32}$$

We often assume that the data errors are uncorrelated and
equal, so that the data variance-covariance matrix is a
constant times the identity matrix,

$$\sigma_d^2 = \sigma^2 I, \tag{33}$$

and the model variance-covariance matrix is

$$\sigma_m^2 = \sigma^2 (G^TG)^{-1}, \tag{34}$$

as proved in problem 4. 
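
The propagation of data errors into model uncertainties (Eqns 32 to 34) can be checked numerically. The sketch below is an illustration under assumptions: it uses the station geometry of Table 7.2-1, a velocity of 5 km/s, a data standard deviation of 0.1 s, and evaluates $G$ at the true source (0, 0, 10) km, so its numbers need not match a covariance matrix evaluated at some other model. The function name `partials` is hypothetical.

```python
import numpy as np

# Station (x, y) coordinates in km from Table 7.2-1; stations sit at z = 0.
stations = np.array([
    [35.0, 9.0], [-44.0, 10.0], [-11.0, -25.0], [23.0, -39.0], [42.0, -27.0],
    [-12.0, 50.0], [-45.0, 16.0], [5.0, -19.0], [-1.0, -11.0], [20.0, 11.0],
])
V = 5.0       # km/s, assumed uniform velocity
SIGMA = 0.1   # s, assumed standard deviation of every arrival time


def partials(m):
    """Partial derivative matrix G (Eqns 24 and 25) for m = (x, y, z, t)."""
    x, y, z, _ = m
    dx = x - stations[:, 0]
    dy = y - stations[:, 1]
    dist = np.sqrt(dx**2 + dy**2 + z**2)
    return np.column_stack(
        [dx / (V * dist), dy / (V * dist), z / (V * dist), np.ones(len(stations))]
    )


G = partials(np.array([0.0, 0.0, 10.0, 0.0]))
G_g = np.linalg.inv(G.T @ G) @ G.T               # generalized inverse (Eqn 17)

cov_d = SIGMA**2 * np.eye(len(stations))         # Eqn 33
cov_m = G_g @ cov_d @ G_g.T                      # Eqn 32
cov_m_short = SIGMA**2 * np.linalg.inv(G.T @ G)  # Eqn 34

std_m = np.sqrt(np.diag(cov_m))
print("model standard deviations (x, y, z, t):", np.round(std_m, 2))
print("depth / origin time covariance:", round(float(cov_m[2, 3]), 3))
```

Even though `cov_d` is diagonal, `cov_m` has nonzero off-diagonal terms, which is the point made in the text: uncorrelated data errors can produce correlated model uncertainties.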

Table 7.2-2 illustrates these ideas for the location example in
the previous section. In this case, Gaussian errors with mean
zero and standard deviation 0.1 s were added to the arrival
times. As a result, the data are inconsistent and cannot be fit
exactly by any model. The inversion thus changes the model
until a good, but not perfect, fit to the data is achieved. This
final model, which is no longer changing much after three
iterations, is close to, but not exactly, the model used to
generate the data. This simple example thus has some features of real
situations.


Table 7.2-2 Earthquake location example with errors.

Invert for location and origin time

model evolution
                                   model for iteration number
parameter          actual value        0        1        2        3
x (km)                  0.0           3.0     -0.2      0.2      0.2
y (km)                  0.0           4.0     -0.9     -0.4     -0.4
z (km)                 10.0          20.0     12.2     12.2     12.2
origin time (s)         0.0           2.0      0.0     -0.2     -0.2

station location (km)            residual (s) for iteration number
    x         y                        0        1        2        3
  35.0       9.0                     -2.0     -0.1      0.1      0.1
 -44.0      10.0                     -3.0     -0.1      0.0      0.0
 -11.0     -25.0                     -3.8      0.0      0.1      0.1
  23.0     -39.0                     -3.2     -0.1      0.0      0.0
  42.0     -27.0                     -2.8     -0.2     -0.1     -0.1
 -12.0      50.0                     -2.1     -0.3     -0.1     -0.1
 -45.0      16.0                     -2.9     -0.1      0.0      0.0
   5.0     -19.0                     -3.7     -0.1      0.0      0.0
  -1.0     -11.0                     -4.0     -0.1      0.0      0.0
  20.0      11.0                     -2.5     -0.3      0.0      0.0
error (s²)                          93.74     0.33     0.04     0.04

data standard deviation (s)   0.10

model variance-covariance matrix
                   x        y        z     origin time
x                 0.06     0.01     0.01      0.00
y                 0.01     0.08    -0.13      0.01
z                 0.01    -0.13     1.16     -0.08
origin time       0.00     0.01    -0.08      0.01

model standard deviation
                   x        y        z     origin time
                  0.25     0.28     1.08      0.10

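The uncertainties in the final model are shown by the model
variance-covariance matrix

$$\sigma_m^2 = \begin{pmatrix}
\sigma^2_{xx} & \sigma^2_{xy} & \sigma^2_{xz} & \sigma^2_{xt} \\
\sigma^2_{yx} & \sigma^2_{yy} & \sigma^2_{yz} & \sigma^2_{yt} \\
\sigma^2_{zx} & \sigma^2_{zy} & \sigma^2_{zz} & \sigma^2_{zt} \\
\sigma^2_{tx} & \sigma^2_{ty} & \sigma^2_{tz} & \sigma^2_{tt}
\end{pmatrix}. \tag{35}$$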


To see that the results seem reasonable, we compare the final
inversion model, taking into account its uncertainty, to the true
model. The standard deviations of each parameter are given by
the square roots of the diagonal terms of the model
variance-covariance matrix, so the final model ($x = 0.2 \pm 0.25$ km,
$y = -0.4 \pm 0.28$ km, $z = 12.2 \pm 1.08$ km, $t = -0.2 \pm 0.10$ s) is an
acceptable representation of the true model.

The model variance-covariance matrix shows some interesting
features. The variance of the depth estimate, $\sigma^2_{zz}$, is larger
than the corresponding terms $\sigma^2_{xx}$ and $\sigma^2_{yy}$, indicating that the
depth is less well constrained than the epicenter. This situation
is common, and arises because all the seismometers are at the
surface.² In some cases when the depth is poorly constrained,
it is regarded as fixed, and only the epicenter and the origin time are
inverted for. The results of multiple inversions, each with the
depth fixed at a different value, are compared to see which best
fits the data. It is also possible to determine the depth from
other criteria, such as the times of surface reflections (Section
4.3), and then invert with the depth fixed.

² Similarly, vertical positions determined using the GPS (Section 4.5.1) by a process
analogous to earthquake location are less precise than the horizontal positions.

The uncertainties in the model parameter estimates are
correlated, because the off-diagonal elements of the model
variance-covariance matrix are nonzero. $\sigma^2_{zt}$, the covariance of
the depth and origin time uncertainties, is negative, indicating
a trade-off between the focal depth and the origin time. At any
station, similar arrival times result if the earthquake occurred
earlier ($t$ smaller) but deeper ($z$ larger). Similarly, $\sigma^2_{xy}$, the
covariance of the $x$ and $y$ location uncertainties, is nonzero,
so the uncertainties in these two parameters are correlated.
A method often used to illustrate this is to extract the 2 × 2
submatrix

$$\begin{pmatrix} \sigma^2_{xx} & \sigma^2_{xy} \\ \sigma^2_{yx} & \sigma^2_{yy} \end{pmatrix} \tag{36}$$

and diagonalize it by finding the eigenvalues $\lambda^{(1)}$ and $\lambda^{(2)}$, and
the associated eigenvectors $(x_1^{(1)}, x_2^{(1)})$ and $(x_1^{(2)}, x_2^{(2)})$. The
uncertainty in the epicenter can then be thought of as an ellipse with
semi-major and semi-minor axes $(\lambda^{(1)})^{1/2}$ and $(\lambda^{(2)})^{1/2}$, oriented in a
direction given by $\tan^{-1}(x_2^{(1)}/x_1^{(1)})$. In this case, the semi-major
and semi-minor axes have lengths of 0.29 and 0.24 km, and the
semi-major axis trends N22°E. An interesting feature of the
error ellipse is that its shape and orientation depend on the
$(G^TG)^{-1}$ matrix, whereas the variance of the data, $\sigma^2$,
controls the size of the ellipse. Because the shape of the error ellipse
depends on the geometry of the receivers, it can be examined
without reference to specific data. As written, the ellipse is for a
confidence level of $1\sigma$ (68%), but ellipses are sometimes also
given for $2\sigma$ (95%), or $3\sigma$ (99%).
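
The error ellipse quoted above can be reproduced from the rounded $x$-$y$ entries of the model variance-covariance matrix in Table 7.2-2. The sketch assumes that $x$ points east and $y$ points north, so that the trend is measured clockwise from the $y$ axis; that axis convention is an assumption made here for illustration.

```python
import numpy as np

# x-y submatrix (Eqn 36) from Table 7.2-2, in km^2.
cov_xy = np.array([[0.06, 0.01],
                   [0.01, 0.08]])

# eigh returns eigenvalues in ascending order for a symmetric matrix,
# with the matching eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(cov_xy)
semi_minor, semi_major = np.sqrt(eigvals)

# Direction of the semi-major axis, clockwise from north (+y), folded
# into [0, 180) because an axis has no preferred sense.
major_vec = eigvecs[:, 1]
azimuth = float(np.degrees(np.arctan2(major_vec[0], major_vec[1])) % 180.0)

print(f"semi-major axis: {semi_major:.2f} km")
print(f"semi-minor axis: {semi_minor:.2f} km")
print(f"trend of semi-major axis: N{azimuth:.1f}E")
```

Scaling the axes by 2 or 3 gives the larger ellipses mentioned in the text; the orientation is unchanged because it depends only on the eigenvectors.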

We have shown that the model variance-covariance matrix 
depends on the variance-covariance matrix of the data. In the 
example, we knew the standard deviation of the data and that 
the errors were uncorrelated. This information would not be 
available for a real experiment. However, we could estimate 
the standard deviation of the data from the misfit between the 
data and the best-fitting model, given by the sample variance $s^2$

$$s^2 = \frac{1}{n-k}\sum_{i=1}^{n}\left(d'_i - d_i\right)^2. \tag{37}$$

Here, $d'_i$ are the observations, $d_i$ are the values of the data
predicted by the best-fitting model, and $k$ is the number of model
parameters determined from the data. Division by $n - k$, the
number of degrees of freedom, rather than by $n$, the number
of data, compensates for the improvement in fit resulting from
the use of model parameters determined from the data. Thus,
for our example, the final squared misfit is 0.04 s² (Table 7.2-2), and four
parameters were determined from the data, so the sample
standard deviation is $s = (0.04/(10 - 4))^{1/2} = 0.08$ s, a value close
to the true $\sigma$, 0.1 s.
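
As a quick check of this arithmetic, the sample standard deviation can be computed directly from the final (iteration 3) residuals listed in Table 7.2-2. The helper name `sample_std` is hypothetical; the residuals are the rounded table values, so the result is only as precise as the table.

```python
import numpy as np


def sample_std(residuals, n_params):
    """Sample standard deviation (Eqn 37) from residuals d'_i - d_i."""
    residuals = np.asarray(residuals, dtype=float)
    dof = residuals.size - n_params          # degrees of freedom, n - k
    return float(np.sqrt(np.sum(residuals**2) / dof))


# Final residuals (s) at the ten stations, from Table 7.2-2.
final_residuals = [0.1, 0.0, 0.1, 0.0, -0.1, -0.1, 0.0, 0.0, 0.0, 0.0]

squared_misfit = float(np.sum(np.square(final_residuals)))
s = sample_std(final_residuals, n_params=4)

print(f"total squared misfit: {squared_misfit:.2f} s^2")
print(f"sample standard deviation: {s:.2f} s")
```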

7.2.4 Earthquake location for more complex geometries

This formulation is not restricted to locating earthquakes in
a homogeneous halfspace. Velocity variations can be
incorporated in the function relating the arrival time at the $i$th station
to the origin time $t$ and travel time $T(\mathbf{x}, \mathbf{x}_i)$,

$$d_i = T(\mathbf{x}, \mathbf{x}_i) + t. \tag{1}$$

For example, a model for locating local earthquakes could have 
a series of layers. As a result, even for a source at the surface, 
the travel time curve is a more complicated function of distance 
(Section 3.2). At close distances, the first arrival is the direct 
wave. At greater distances, the first arrival becomes a head 
wave from an interface at depth, with the relevant interface 
being deeper as the distance increases. The situation is similar, 
but more complicated, for a source at depth, because at zero 
distance the travel time is nonzero. 

The travel time curve can be found either analytically or by
tracing rays. If the receivers are on the surface at $(x_i, y_i)$, the
travel time curve $T(r, z)$ depends on the horizontal distance
between source and receiver,

$$r_i = \left[(x - x_i)^2 + (y - y_i)^2\right]^{1/2}, \tag{38}$$

and the source depth $z$, so the arrival times are

$$d'_i = T(r_i, z) + t. \tag{39}$$

In this case, the $x$ derivatives are found by

$$\frac{\partial d_i}{\partial x} = \frac{\partial T(r_i, z)}{\partial x} = \frac{\partial T(r_i, z)}{\partial r}\frac{\partial r_i}{\partial x}
= \frac{\partial T(r_i, z)}{\partial r}\frac{(x - x_i)}{r_i}, \tag{40}$$

and similarly for the $y$ derivatives. If $\zeta_i$ is the azimuth from the
source to the receiver (Fig. 7.2-4),

$$(x - x_i)/r_i = -\sin\zeta_i \quad\text{and}\quad (y - y_i)/r_i = -\cos\zeta_i. \tag{41}$$

Fig. 7.2-4 Map view of the relation between an earthquake epicenter and
a seismic station in Cartesian coordinates.

If the travel time curve is found numerically, then $T(r_i, z)$ is a
set of values for various points $(r, z)$ rather than an explicit
function. The procedure for location is still the same, except
that the $x$, $y$, and $z$ partial derivatives are computed
numerically. For example, if we begin by assuming that the source is at
$(x^0, y^0, z^0)$, then the partial derivative with respect to $r$ about

$$r_i^0 = \left[(x^0 - x_i)^2 + (y^0 - y_i)^2\right]^{1/2} \tag{42}$$

is found using the tabulated travel times for points $(r_i^0 + \delta/2, z^0)$
and $(r_i^0 - \delta/2, z^0)$. Thus the $x$ derivatives are found by
approximating the derivative by a difference

$$\frac{\partial T(r_i, z^0)}{\partial x} = \frac{\partial T(r_i, z^0)}{\partial r}\frac{\partial r_i}{\partial x}
\approx \frac{T(r_i^0 + \delta/2, z^0) - T(r_i^0 - \delta/2, z^0)}{\delta}\,\frac{(x^0 - x_i)}{r_i^0}, \tag{43}$$

and the y derivatives are found similarly. The z derivatives 
are found numerically by forming the difference between two 
depths. The inversion is then done as before. 
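
The difference approximation of Eqn 43 can be sketched with a tabulated travel time curve. Everything about the table here is an assumption made for illustration: it is filled from the uniform-velocity formula with $v$ = 5 km/s on a 0.5 km grid, so that the numerical derivative can be compared with the analytic one (Eqn 24). A real table would come from a layered model or ray tracing, and the names `travel_time` and `x_derivative` are hypothetical.

```python
import numpy as np

V = 5.0  # km/s, used only to build a test table with a known answer

# Hypothetical travel time table T(r, z): rows are distance, columns depth.
r_grid = np.linspace(0.0, 100.0, 201)    # km, 0.5 km spacing
z_grid = np.linspace(0.0, 30.0, 61)      # km, 0.5 km spacing
table = np.sqrt(r_grid[:, None]**2 + z_grid[None, :]**2) / V


def travel_time(r, z):
    """Look up T(r, z) by linear interpolation, first in r and then in z."""
    t_at_r = np.array(
        [np.interp(r, r_grid, table[:, j]) for j in range(z_grid.size)]
    )
    return float(np.interp(z, z_grid, t_at_r))


def x_derivative(x0, y0, z0, xi, yi, delta=1.0):
    """Approximate dT/dx about the trial source (x0, y0, z0), as in Eqn 43."""
    r0 = np.hypot(x0 - xi, y0 - yi)                      # Eqn 42
    dT_dr = (travel_time(r0 + delta / 2, z0)
             - travel_time(r0 - delta / 2, z0)) / delta  # central difference
    return dT_dr * (x0 - xi) / r0


# Trial source (3, 4, 20) km and a station at (35, 9) km on the surface.
x0, y0, z0 = 3.0, 4.0, 20.0
xi, yi = 35.0, 9.0

numeric = x_derivative(x0, y0, z0, xi, yi)
analytic = (x0 - xi) / (V * np.sqrt((x0 - xi)**2 + (y0 - yi)**2 + z0**2))

print(f"numerical dT/dx: {numeric:.4f} s/km")
print(f"analytic  dT/dx: {analytic:.4f} s/km")
```

The step $\delta$ trades truncation error against the resolution of the table; it is usually chosen to be comparable to the table spacing.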


The location of earthquakes for a spherical earth is similar.
As before, we assume that velocity varies only with depth. In
this case, for an earthquake at colatitude $\theta$, longitude $\phi$, focal
depth $z$, and origin time $t$, we seek to estimate the model vector
$\mathbf{m} = (\theta, \phi, z, t)$ from the data.

The travel time to receivers on the surface at colatitudes $\theta_i$
and longitudes $\phi_i$ depends on the focal depth and the angular
distance from the epicenter (Eqn A.7.7),

$$\cos\Delta_i = \cos\theta\cos\theta_i + \sin\theta\sin\theta_i\cos(\phi_i - \phi). \tag{44}$$

For a travel time curve $T(\Delta, z)$ the arrival times are

$$d_i = T(\Delta_i, z) + t. \tag{45}$$

Several average global travel time curves are available, as in 
Fig. 3.5-4. In addition, a travel time curve for a specific velocity 
model can be found numerically by tracing rays. 

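In this case, the $\theta$ derivatives are found using

$$\frac{\partial d_i}{\partial\theta} = \frac{\partial T(\Delta_i, z)}{\partial\theta} = \frac{\partial T(\Delta_i, z)}{\partial\Delta}\frac{\partial\Delta_i}{\partial\theta}. \tag{46}$$

To find the last term, note that

$$\frac{\partial(\cos\Delta_i)}{\partial\theta} = \frac{\partial(\cos\Delta_i)}{\partial\Delta_i}\frac{\partial\Delta_i}{\partial\theta} = -\sin\Delta_i\,\frac{\partial\Delta_i}{\partial\theta}, \tag{47}$$

so, differentiating Eqn 44,

$$\frac{\partial\Delta_i}{\partial\theta} = \frac{\sin\theta\cos\theta_i - \cos\theta\sin\theta_i\cos(\phi_i - \phi)}{\sin\Delta_i} = \cos\zeta_i, \tag{48}$$

where $\zeta_i$ is the azimuth of the $i$th station with respect to the
earthquake (Eqn. A.7.10). Thus the partial derivatives with
respect to source colatitude are

$$\frac{\partial d_i}{\partial\theta} = \frac{\partial T(\Delta_i, z)}{\partial\Delta}\cos\zeta_i. \tag{49}$$

Similarly, because by the same method

$$\frac{\partial\Delta_i}{\partial\phi} = -\frac{\sin\theta\sin\theta_i\sin(\phi_i - \phi)}{\sin\Delta_i} = -\sin\theta\sin\zeta_i, \tag{50}$$

the partial derivatives with respect to source longitude are

$$\frac{\partial d_i}{\partial\phi} = -\frac{\partial T(\Delta_i, z)}{\partial\Delta}\sin\theta\sin\zeta_i. \tag{51}$$

Fig. 7.2-5 Comparison of epicenters for earthquakes in central Idaho
derived by a standard location program (PDE, open triangles) and from a
joint epicenter determination study (JED, closed symbols). Error ellipses
are shown for JED locations. The JED epicenters suggest a narrower
source region than the PDE epicenters. (Dewey, 1987. © Seismological
Society of America. All rights reserved.)

The two derivatives required from the travel time table,
$\partial T(\Delta_i, z)/\partial\Delta$ and $\partial T(\Delta_i, z)/\partial z$, can be approximated by forming
differences between tabulated values. This approach is used to locate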
earthquakes all over the world using teleseismic data, often 
from hundreds of stations. 
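
The geometric part of these partial derivatives can be checked numerically. The sketch below computes the angular distance of Eqn 44 and compares finite differences of it with the closed forms $\partial\Delta_i/\partial\theta = \cos\zeta_i$ and $\partial\Delta_i/\partial\phi = -\sin\theta\sin\zeta_i$, which follow from differentiating Eqn 44. It assumes the azimuth is measured clockwise from north at the epicenter; the source and station coordinates and the function names are hypothetical.

```python
import numpy as np


def angular_distance(theta, phi, theta_i, phi_i):
    """Angular distance from Eqn 44; all angles in radians."""
    c = (np.cos(theta) * np.cos(theta_i)
         + np.sin(theta) * np.sin(theta_i) * np.cos(phi_i - phi))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))


def azimuth(theta, phi, theta_i, phi_i):
    """Azimuth of the station from the epicenter, clockwise from north."""
    s = np.sin(theta_i) * np.sin(phi_i - phi)
    c = (np.cos(theta_i) * np.sin(theta)
         - np.sin(theta_i) * np.cos(theta) * np.cos(phi_i - phi))
    return float(np.arctan2(s, c))


def distance_partials(theta, phi, theta_i, phi_i):
    """Closed-form derivatives of angular distance w.r.t. source position."""
    zeta = azimuth(theta, phi, theta_i, phi_i)
    return np.cos(zeta), -np.sin(theta) * np.sin(zeta)


# Hypothetical source (colatitude 50, longitude 20 degrees) and station
# (colatitude 30, longitude 75 degrees).
theta, phi = np.radians(50.0), np.radians(20.0)
theta_i, phi_i = np.radians(30.0), np.radians(75.0)

dD_dtheta, dD_dphi = distance_partials(theta, phi, theta_i, phi_i)

# Central finite differences of Eqn 44 for comparison.
h = 1e-6
fd_theta = (angular_distance(theta + h, phi, theta_i, phi_i)
            - angular_distance(theta - h, phi, theta_i, phi_i)) / (2 * h)
fd_phi = (angular_distance(theta, phi + h, theta_i, phi_i)
          - angular_distance(theta, phi - h, theta_i, phi_i)) / (2 * h)

print(f"d(Delta)/d(theta): analytic {dD_dtheta:.6f}, numerical {fd_theta:.6f}")
print(f"d(Delta)/d(phi):   analytic {dD_dphi:.6f}, numerical {fd_phi:.6f}")
```

Multiplying these by $\partial T/\partial\Delta$ from the travel time table gives the first two columns of $G$ for the spherical problem.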

We can also locate earthquakes in a laterally varying struc¬ 
ture using a numerical representation of the travel time curve. 
In this case, the travel times, and hence partial derivatives, de¬ 
pend on the actual positions of the source and the receiver, not 
just on the distance between them. The techniques discussed so 
far will work, with the modification that the travel times, and 
hence partial derivatives, must be computed, by tracing rays or 
otherwise, for each source-receiver pair. The computational 
effort involved is large enough that laterally homogeneous 
models are used whenever possible. 

A number of methods are sometimes applied to improve 
locations derived using a laterally homogeneous model. Some 
treat residuals at individual stations as station corrections to be 
removed. Master event methods consider a particular (often 
the largest) earthquake in a group as the best located, and then 


locate a group of nearby earthquakes using a travel time correc¬ 
tion at each station derived from the residual at each station for 
the master event. This procedure attempts to locate the other 
events more accurately with respect to the master event. Joint 
hypocenter determination methods use data from a number of 
nearby earthquakes, and locate them simultaneously to best fit 
the travel times. Figure 7.2-5 illustrates applying this technique 
to a group of earthquakes: the locations from a joint epicenter 
determination study are more closely grouped and are shifted 
somewhat from the epicenters for the same events found by the 
standard location program. 

When considering earthquake location, the travel time 
residuals remaining once the “best” location is found are a 
nuisance. Following the dictum that “one person’s signal is 
another’s noise” brings us naturally to our next topic, the use 
of these travel time residuals to study deviations from a later¬ 
ally homogeneous earth model. 


7.3 Travel time tomography 

In the last section we noted that travel time observations con¬ 
tain information about both the location and the origin time 
of the seismic source and the velocity structure in the region 
between the source and receivers. Thus, for the simple halfspace 
example shown, we also inverted the travel time residuals to 
find the best velocity. This is analogous to the way in Chapter 3 
that we discussed techniques to develop layered models in 
which velocity varied only with depth. However, we have seen 
that many of the earth’s most interesting processes, such as 
subduction, cause deviations from a laterally homogeneous 
velocity model. Methods have thus been developed to use seis¬ 
mological data to investigate laterally heterogeneous structure. 
For example, we have discussed using lateral variations in 
surface wave velocities to investigate the cooling of oceanic 
lithosphere (Section 2.8.3) and migration of seismic reflection 
data to image variable structure at depth (Section 3.3.7). In this 
section we introduce the concepts of travel time tomography , 
some of whose results we have seen in Sections 3.7 and 5.4. 
This discussion illustrates both some further general aspects of 
inverse problems and some specific features of inverting for 
earth structure. 


7.3.1 Theory 

Consider the path $s$ of a seismic ray through a medium whose
velocity $v$ varies with position. The travel time, $T$, is

$$T = \int \frac{1}{v(s)}\,ds = \int u(s)\,ds, \tag{1}$$

the integral of 1/velocity, the slowness, along the ray path. The
ray path, in turn, is determined by the velocity distribution.
Suppose now that the slowness at various points along the path
is perturbed by an amount $\delta u(s)$ small enough that the ray path
is essentially unchanged, but the travel time changes by

$$\delta T = \int \delta u(s)\,ds. \tag{2}$$

We can then use the changes in travel time to study the velocity
changes that caused them.

Fig. 7.3-1 Geometry of a region being studied using travel time
tomography. The region is divided into blocks $j$, whose perturbations in
velocity are to be found from the travel time along ray paths $i$. The velocity
outside the blocks is assumed to be laterally homogeneous, so travel time
perturbations with respect to the reference model are used to find the
velocity perturbations within the blocks.

Because the travel time perturbation reflects the slowness 
perturbation integrated along the ray path, a single observation 
does not indicate how the perturbation is distributed along 
the path. A large localized perturbation and a smaller, but more 
widely distributed, one could give the same effect. To improve 
resolution, data from ray paths that sample the medium differ¬ 
ently can be combined (Fig. 7.3-1). The simplest spatial dis¬ 
tribution of the slowness perturbation divides the medium into 
a number of homogeneous subregions termed blocks, or cells. 
Thus the integral (Eqn 2) giving the travel time perturbation
along the $i$th ray path is written in discrete form

$$\Delta T_i = \sum_j G_{ij}\,\Delta u_j, \tag{3}$$

where $G_{ij}$ is the distance the $i$th ray travels in the $j$th block, and
$\Delta u_j$ is the slowness perturbation in the block.

Our goal is to use the observed travel times along a number 
of paths through the medium to recover the slowness perturba¬ 
tion. Problems of this type, in which observations of properties 
integrated along a number of paths through the medium are 
used to infer the two- or three-dimensional distribution of the 
physical property within a medium, occur in many branches
of science and are known collectively as tomography.¹ The
two- or three-dimensional perturbation can be thought of as
an image, which we seek to reconstruct from observations.
The observations, one-dimensional integrals through the
perturbation, are known as projections.

In travel time tomography, the inverse problem of estimating 
the slowness perturbation from the observed travel time perturbation 
has the form discussed in the last section 

$d = Gm$, or $d_i = \sum_j G_{ij} m_j$. (4) 

As before, we do not explicitly write the Δs, so the model vector 
m is the perturbation in slowness from a starting model, and 
the data vector d is the difference between the observed travel 
times and those predicted by the starting model. The elements 
of the partial derivative matrix 

$G_{ij} = \frac{\partial d_i}{\partial m_j} = \frac{\partial T_i}{\partial u_j}$ (5) 

equal the distance the ith ray travels in the jth block, which is the 
partial derivative of the ray’s travel time with respect to the 
slowness in the block. 

The matrix G is an operator that relates model vectors and 
data vectors. As in the location problem, these vectors are 
physically different quantities with different dimensions. The 
model vectors have as many elements as there are blocks in the 
model, whereas the data vectors have a number of elements 
equal to the number of ray paths. Mathematically, this means 
that if there are r blocks in the model, any model vector is a 
vector in an r-dimensional model space. Similarly, if there are n 
travel times and thus n ray paths, any data vector is a vector 
in an n-dimensional data space. Because there are generally 
many more equations (ray paths) than unknowns (model parameters), 
the system of equations is overdetermined. Because 
the data contain noise, the system of equations is generally also 
inconsistent. 

The inverse problem is solved by a procedure like that discussed 
for the location problem. For the different ray paths, 
the travel times and the distances traveled in each block are predicted 
using a starting or reference model. The starting model is 
generally laterally homogeneous, so the travel times are easily 
calculated. Travel time residuals are then computed for each 
ray path by subtracting the times predicted by the starting 
model from those observed. These travel time residuals form 
the data vector that is inverted using the generalized inverse to 
find slowness changes that predict the travel time residuals as 
well as possible. 
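
The residual step can be sketched as follows. The ray geometry, the starting slowness, and the 1% anomaly are all assumptions made for illustration; the geometry is simply one choice of four steep rays through a 2 x 2 grid of unit blocks that is consistent with the matrix $G^T G$ quoted in Eqn 24.

```python
import math

# Assumed geometry (not taken from the text): row i lists the distance that
# ray i travels in blocks 1-4 of a 2 x 2 grid of unit blocks.
S = math.sqrt(2.0)
G = [
    [1.0, 0.0, 1.0, 0.0],   # ray through blocks 1 and 3
    [0.0, 1.0, 0.0, 1.0],   # ray through blocks 2 and 4
    [0.0, S, S, 0.0],       # oblique ray through blocks 2 and 3
    [S, 0.0, 0.0, S],       # oblique ray through blocks 1 and 4
]

def predict_times(path_lengths, slowness):
    """Travel time of each ray: path length times slowness, summed over blocks."""
    return [sum(g * u for g, u in zip(row, slowness)) for row in path_lengths]

u0 = 0.125                                   # s/km, hypothetical (8 km/s)
starting_model = [u0, u0, u0, u0]            # laterally homogeneous reference
actual_model = [u0, u0, 1.01 * u0, u0]       # hypothetical: block 3 is 1% slow

observed = predict_times(G, actual_model)    # stands in for measured times
predicted = predict_times(G, starting_model)

# Data vector d: observed minus predicted travel times.
residuals = [obs - pre for obs, pre in zip(observed, predicted)]
print(residuals)
```

Only the two rays that cross block 3 have nonzero residuals; these residuals are what the inversion maps back into slowness changes.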

To illustrate these ideas, consider a schematic experiment in 
which a region under a seismic array is divided into four square 
blocks of unit length (Fig. 7.3-2). Travel time residuals from 
six ray paths form the data. Four paths (1-4), which can be 
thought of as due to distant (teleseismic) earthquakes, traverse 
the model vertically. Two paths (5,6), which can be thought of 
as due to local earthquakes, traverse the model horizontally. 


1 This term is Greek for “slice picture.” 

Fig. 7.3-2 Ray path and block geometry for an idealized 
tomographic experiment. Each block is sampled by three 
different ray paths. 


The reference slowness model is assumed to be appropriate 
outside the blocks, so the entire travel time residual for each 
path is attributed to slowness perturbations in the blocks. Thus 
the problem looks like 

We encountered this problem, solving a vector-matrix equation 
where the matrix is not square, in the last section. As in 
that case, we form 


$G^T G m = G^T d$ (7) 

and invert the square matrix $G^T G$ to form the generalized 
inverse solution 

(10) 


After multiplying by $G^T$, we attempt to solve this system as 
before, but find that the matrix $G^T G$ has a zero determinant, so 
it cannot be inverted. Thus, although the system of equations 
(7) has four equations for four unknowns, it does not have a 
unique solution (Section A.4.4). It turns out that this is because 
the rows of G are not linearly independent. Thus the ray 
geometry is not adequate to fully resolve the slowness perturbations 
in the four blocks. 
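
The singularity can be checked directly with exact integer arithmetic. The sketch below uses the four-ray matrix $G^T G$ as quoted in Eqn 24 and the vector of Eqn 29, scaled by 2 to keep integers; the helper function is ours, not part of the text.

```python
def determinant(matrix):
    """Determinant by cofactor expansion along the first row (fine for 4 x 4)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total

# G^T G for the four-ray case, as quoted in Eqn 24.
GTG = [
    [3, 0, 1, 2],
    [0, 3, 2, 1],
    [1, 2, 3, 0],
    [2, 1, 0, 3],
]

det = determinant(GTG)

# Twice the vector of Eqn 29; G^T G maps it to the zero vector.
v = [1, 1, -1, -1]
GTG_v = [sum(a * x for a, x in zip(row, v)) for row in GTG]

print(det, GTG_v)
```

A zero determinant together with a nonzero vector that the matrix maps to zero are two views of the same fact: one combination of block slownesses leaves every travel time unchanged.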

Because this situation occurs frequently in solving inverse 
problems, methods for dealing with it have been developed. 
Although a full treatment is beyond our scope, we summarize 
some key ideas without proof. 

In the general case when G is an n × r matrix, $G^T G$ is an r × r 
symmetric matrix that can be decomposed using its eigenvectors 
and eigenvalues (Section A.5.3) 


$m^g = (G^T G)^{-1} G^T d = G^{-g} d.$ (8) 

We next ask how $m^g$, the model found by the inversion, 
compares to the actual slowness model that gave rise to 
the travel time data. To compare the two, we substitute Gm 
for d in Eqn 8, and find that in this case 

$m^g = (G^T G)^{-1} G^T G m = m,$ (9) 

so the inversion correctly resolves the true model. Naturally, 
if errors are present in the data, these errors propagate into the 
results of the inversion, as discussed previously. 
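
Eqn 9 can be verified numerically for a six-ray geometry. The G matrix below is an assumption: four steep rays that cross both layers, plus one horizontal ray in each layer, through a 2 x 2 grid of unit blocks. The solver is a plain Gaussian elimination written for this sketch.

```python
import math

S = math.sqrt(2.0)
# Assumed six-ray geometry (hypothetical, for illustration only).
G = [
    [1.0, 0.0, 1.0, 0.0],   # rays 1-4: steep paths crossing both layers
    [0.0, 1.0, 0.0, 1.0],
    [0.0, S, S, 0.0],
    [S, 0.0, 0.0, S],
    [1.0, 1.0, 0.0, 0.0],   # ray 5: horizontal, upper layer only
    [0.0, 0.0, 1.0, 1.0],   # ray 6: horizontal, lower layer only
]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

true_model = [0.0, 0.0, 0.01, 0.0]           # hypothetical slowness perturbation
d = matvec(G, true_model)                    # noise-free data, d = G m

Gt = [list(col) for col in zip(*G)]          # rows of G^T are columns of G
GtG = [[sum(a * b for a, b in zip(ci, cj)) for cj in Gt] for ci in Gt]
Gtd = matvec(Gt, d)

recovered = solve(GtG, Gtd)                  # m^g = (G^T G)^{-1} G^T d
print(recovered)
```

With noise-free data the recovered model matches the true model to rounding error, as Eqn 9 states.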

7.3.2 Generalized inverse 

An interesting situation occurs in this example if only the four 
teleseismic ray paths (1-4) are available. The inverse problem 
becomes finding the four elements of m from 


$G^T G = V \Lambda V^T,$ (11) 

where the columns of matrix V are the r eigenvectors of $G^T G$ 
and Λ is a diagonal matrix with eigenvalues on the diagonal 
and zeroes elsewhere 

Because the eigenvectors are orthogonal, 
$V V^T = V^T V = I$, so $V^T = V^{-1}$. 

If $G^T G$ has an inverse, 

$(G^T G)^{-1} = (V \Lambda V^T)^{-1} = V \Lambda^{-1} V^T,$ 


that involves only the eigenvectors with nonzero eigenvalues 
gives an optimal solution to the inverse problem. This solution 
provides the best fit to the data while minimizing m, the change 
from the starting model. This is a desirable feature: for example, 
in the tomographic problem, we start with a laterally 
homogeneous model, so the best solution is that with least 
lateral velocity variation consistent with the data. 


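where 

$\Lambda^{-1} = \begin{pmatrix} 1/\lambda_1 & 0 & \cdots & 0 \\ 0 & 1/\lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\lambda_r \end{pmatrix}.$ 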

This expression shows that $G^T G$ is singular if at least one 
eigenvalue is zero. In this case, the p nonzero eigenvalues are 
used to form the p × p diagonal matrix 

$\Lambda_p = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_p \end{pmatrix}$ 

and the associated eigenvectors are divided into two matrices: 

$V_p$ and $V_0$. (18) 

$V_p$ is the r × p matrix of the eigenvectors with nonzero eigenvalues, 
and $V_0$ is the r × (r − p) matrix of the eigenvectors with 
zero eigenvalues. 

Similarly, the n × n matrix $G G^T$ can be decomposed as 

$G G^T = U \Lambda U^T,$ (19) 


7.3.3 Properties of the generalized inverse solution 

The relation between the solution to the inverse problem, the 
model derived from the data using 

$m^p = G^{-p} d,$ (22) 

and the “true” (although unknown) model m, can be found 
because the data are related to the “true” model by the forward 
problem (Eqn 4), so 

$m^p = G^{-p} G m = V_p \Lambda_p^{-1} U_p^T U_p \Lambda_p V_p^T m = V_p V_p^T m.$ (23) 

Thus the matrix $G^{-p} G = V_p V_p^T$ is known as the model resolution 
matrix. 

The derivation used the fact that $U_p^T U_p = I$, because the columns 
of $U_p$ and hence the rows of $U_p^T$ are orthonormal 
eigenvectors. Similarly, $V_p^T V_p = I$. By contrast, if there are 
some zero eigenvalues, then $p \neq n$, $U_p U_p^T \neq I$ and $p \neq r$, $V_p V_p^T \neq I$, 
because the rows of $U_p$ and $V_p$ are no longer orthonormal 
eigenvectors (because the columns corresponding to the zero 
eigenvalues have been removed to form the $V_0$ and $U_0$ matrices). 

To illustrate these ideas, consider the example in Eqn 10. The 
G matrix yields 


$G^T G = \begin{pmatrix} 3 & 0 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 1 & 2 & 3 & 0 \\ 2 & 1 & 0 & 3 \end{pmatrix},$ (24) 

which has eigenvalues 0, 2, 4, 6, and hence is singular. The 
eigenvector matrices are 


using its eigenvector matrix U. $G G^T$ has the same p nonzero 
eigenvalues as $G^T G$, so the U matrix can be divided into $U_p$, 
the n × p matrix of the eigenvectors with nonzero eigenvalues, 
and $U_0$, the n × (n − p) matrix of the eigenvectors with zero 
eigenvalues. Although we do not prove it here, it is possible 
to decompose the matrix G using only the eigenvectors with 
nonzero eigenvalues: 

$G = U \Lambda V^T = U_p \Lambda_p V_p^T.$ (20) 

This decomposition, known as the Lanczos decomposition, is 
important, because a generalized inverse 

$G^{-p} = V_p \Lambda_p^{-1} U_p^T$ (21) 

$V_p = \begin{pmatrix} \vdots & \vdots & \vdots \\ -0.5 & 0.5 & 0.5 \\ 0.5 & -0.5 & 0.5 \end{pmatrix}$ and $V_0 = \begin{pmatrix} 0.5 \\ 0.5 \\ -0.5 \\ -0.5 \end{pmatrix}.$ (25) 

The model resulting from the inversion m p is then related to the 
“true” (although unknown) model m by the model resolution 
matrix 


$m^p = V_p V_p^T m$ 

V matrices their form, controls the resolution. Note that in the 
first example, in which all six ray paths are used, Eqn 9 shows 
that the model from the inversion was the true model. In this 
case the resolution matrix is the identity matrix. 

To see how the lack of resolution in the four-ray case arises, 
consider what would occur if $G^T G$ had no zero eigenvalues and 
could be inverted. Then, by Eqns 21 and 22, the model derived 
from the data would be 

$m^p = V \Lambda^{-1} U^T d,$ (27) 

because $V_p = V$. The model is thus a linear combination of 
the columns of V, or the eigenvectors of $G^T G$. Because there 
are r (in this case four) linearly independent eigenvectors, and 
the model vector has r elements, the eigenvectors span the 
r-dimensional model space. Thus any vector in the model space 
is a possible model. 

If instead, as in this case, some of the eigenvalues are zero, 
the eigenvectors associated with them are excluded from the $V_p$ 
matrix. The model 

$m^p = V_p \Lambda_p^{-1} U_p^T d$ (28) 


Fig. 7.3-3 Illustration of the “blurring” resulting from the tomographic 
experiment of Fig. 7.3-2, with incomplete ray coverage. When coverage is 
adequate, the true slowness perturbation (top left) is recovered (top right). 
When coverage is inadequate, the true slowness perturbation (lower left) is 
blurred (lower right), although the resulting slowness perturbations yield 
the correct travel time perturbation for each ray path. 

The ith column of the model resolution matrix shows how a 
unit perturbation in the ith element of the true model maps into 
various elements of $m^p$. The true model is thus “blurred” by the 
inversion. For example (Fig. 7.3-3), inversion of travel time 
data resulting from a 1% slowness perturbation in block 3 
yields a model with 0.25% perturbations in blocks 1 and 2, a 
0.75% perturbation in block 3, and a -0.25% perturbation in 
block 4. These slowness perturbations yield the correct travel 
time perturbations for the four paths, but because there are no 
horizontal paths, the solution is not exactly correct. However, 
most of the perturbation is correctly placed. Note that the 
resolved structure has a smaller net slowness perturbation than 
the true structure. 
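
The numbers in this example can be reproduced from the zero-eigenvalue eigenvector alone. Because the full eigenvector matrix is orthogonal, $V_p V_p^T = I - V_0 V_0^T$, so the sketch below builds the resolution matrix from the vector of Eqn 29 without an eigenvalue routine. The four-ray G used to check the travel times is an assumed geometry consistent with Eqn 24.

```python
import math

S = math.sqrt(2.0)
# Assumed four-ray geometry (hypothetical); rows are rays, columns are blocks.
G = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, S, S, 0.0],
    [S, 0.0, 0.0, S],
]

v0 = [0.5, 0.5, -0.5, -0.5]     # eigenvector with zero eigenvalue (Eqn 29)

# Model resolution matrix R = V_p V_p^T = I - V_0 V_0^T.
R = [[(1.0 if i == j else 0.0) - v0[i] * v0[j] for j in range(4)]
     for i in range(4)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

true_model = [0.0, 0.0, 1.0, 0.0]            # 1% slowness perturbation in block 3
blurred_model = matvec(R, true_model)        # third column of R

# Both models predict the same travel time perturbations.
times_true = matvec(G, true_model)
times_blurred = matvec(G, blurred_model)

print(blurred_model)
```

The blurred model is the third column of the resolution matrix, and it fits the four travel times as well as the true model does.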

The relation between the resolution matrix and the model 
covariance matrix (Eqn 7.2.32) is interesting. The blurring 
illustrated by the resolution matrix results from the ray geometry 
and would occur even if the data contained no errors. 
In other words, the resolution matrix illustrates how well the 
inverse problem could be solved for perfect data. Because the 
data usually contain errors, the uncertainty in the model, given 
by the model covariance matrix, reflects errors induced in the 
model by both the ray geometry and the data errors. 

Because the resolution matrix shows how a perturbation in 
any block is resolved by the inversion, it can be used to find 
how well the inversion can recover an arbitrary slowness 
anomaly. Thus the ray geometry, which gives the G and hence 


is then a linear combination of only the columns of $V_p$, the 
eigenvectors associated with the nonzero eigenvalues. In this 
case, there are p (here three) rather than r linearly independent 
eigenvectors. Hence not all possible vectors in the model 
space can be constructed. The model resulting from the inversion 
contains no linear combinations of the eigenvectors associated 
with the zero eigenvalues. 

To illustrate this idea, consider the four-ray case where the 
eigenvector associated with the zero eigenvalue is (from Eqn 25) 

$v = (0.5, 0.5, -0.5, -0.5)^T.$ (29) 

This vector corresponds to equal slowness perturbations in 
blocks 1 and 2 and equal perturbations of opposite sign in 
blocks 3 and 4. Physically, this means changing the slowness 
everywhere in the upper layer by some amount, and making the 
opposite change in the lower layer. Because all four teleseismic 
rays have equal path lengths in the upper and lower layers, 
their travel times are unaffected, so travel time data cannot 
resolve any such change. 

Another way to see this is to consider Eqn 7 and note that if 
v is an eigenvector whose eigenvalue is zero, 

$(G^T G) v = 0,$ (30) 

so that even if the model contains a linear combination of such 
eigenvectors, they have no effect on the problem. The zero 
eigenvectors thus limit the resolution of the model. Because any 
linear combination of these eigenvectors has no effect, the 
model resulting from the inversion is not unique. It is possible 
to prove that the generalized inverse $G^{-p}$ finds a “best” model 
with no contribution from these eigenvectors. Mathematically, 
the resulting model is restricted to the $V_p$ space and has no components 
in the $V_0$ space. As a result, this model is the minimum 
possible solution consistent with the data. In this application, 
the minimum model gives the least lateral perturbation in 
slowness consistent with the travel time data. Philosophically, 
this is an attractive approach. 

The six-ray case, by contrast, had no zero eigenvalues. 
Because one ray traveled only in the upper layer and another 
traveled only in the lower layer, a change in the slowness in 
either layer would affect the travel times. This ray geometry 
avoids the ambiguity of the four-ray case, so the model is fully 
resolved. There is no $V_0$ space, so $V = V_p$, $G^T G$ can be inverted, 
and the solution is found using the generalized inverse $G^{-g}$ 
(Eqn 8). To see how this is related to the generalized inverse 
$G^{-p}$, we use the Lanczos decomposition (Eqn 20) to expand G: 

$G^T G = (V \Lambda_p U_p^T)(U_p \Lambda_p V^T) = V \Lambda_p^2 V^T,$ (31) 

$(G^T G)^{-1} = V \Lambda_p^{-2} V^T,$ (32) 

where the matrix products $\Lambda_p^2 = \Lambda_p \Lambda_p$ and $\Lambda_p^{-2} = \Lambda_p^{-1} \Lambda_p^{-1}$. Thus, 
if $G^T G$ can be inverted, the generalized inverse 

$G^{-g} = (G^T G)^{-1} G^T = (V \Lambda_p^{-2} V^T)(V \Lambda_p U_p^T) = V \Lambda_p^{-1} U_p^T = G^{-p}.$ (33) 

Hence $G^{-p}$ is the general form of the generalized inverse, and 
$G^{-g}$ is the special form that applies if $G^T G$ can be inverted. The 
latter form, $G^{-g}$, is easier to compute because it does not require 
the eigenvector decomposition. Fortunately, it can often be 
used in applications such as earthquake location. 

The eigenvector decomposition also divides the data space 
into two portions, $U_p$ and $U_0$, reflecting the nonzero and zero 
eigenvalues. Data vectors in the $U_0$ space, linear combinations 
of the eigenvectors whose eigenvalues are zero, cannot be generated 
by the operator G for any model. For example, in the 
six-ray case there cannot be six linearly independent observations 
because the model has only four parameters. Thus two of 
the six eigenvectors of the 6 × 6 matrix $G G^T$ must have zero 
eigenvalues. These eigenvectors represent travel time observations 
that should be impossible, given the geometry of the 
experiment. If the data contained some linear combinations 
of these eigenvectors, perhaps due to noise in the data, the 
inversion process could never generate a model capable of 
matching them. 
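
As a rough check of this statement, the sketch below exhibits two independent data vectors that the operator cannot generate. The six-ray G is an assumed geometry (four steep rays plus one horizontal ray in each layer), and the two vectors were found by inspection for that geometry; they span the $U_0$ space there but are not normalized or mutually orthogonal, so they are not the eigenvectors themselves.

```python
import math

S = math.sqrt(2.0)
# Assumed six-ray geometry (hypothetical); rows are rays, columns are blocks.
G = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, S, S, 0.0],
    [S, 0.0, 0.0, S],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

def gt_times(G, u):
    """Compute G^T u without forming the transpose."""
    return [sum(G[i][j] * u[i] for i in range(len(G))) for j in range(len(G[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Two data-space directions satisfying G^T u = 0 for this geometry.
u_a = [1.0, 1.0, 0.0, 0.0, -1.0, -1.0]
u_b = [1.0, 1.0, -1.0 / S, -1.0 / S, 0.0, 0.0]

check_a = gt_times(G, u_a)
check_b = gt_times(G, u_b)

# Data predicted by any model have no component along u_a or u_b.
arbitrary_model = [0.3, -1.2, 0.7, 2.0]
predicted = [dot(row, arbitrary_model) for row in G]
projection_a = dot(predicted, u_a)
projection_b = dot(predicted, u_b)

print(check_a, check_b, projection_a, projection_b)
```

Any part of noisy data lying along these directions is left in the misfit, whatever model the inversion returns.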

Figure 7.3-4 summarizes these ideas: the operator G and 
its generalized inverse $G^{-p}$ relate the model and data spaces. 
Portions of these spaces are not “illuminated.” Any part of the 
model in the $V_0$ portion of the model space has no effect on 
the data, and thus cannot be detected. Thus, if $V_0$ space exists, 
the model found by solving the inverse problem is not unique. 
This situation can only be improved by additional types of 
data, such as a new set of ray paths in the tomographic example 
(Fig. 7.3-3). 2 Similarly, any part of the data in the $U_0$ portion 
of the data space cannot be described by any possible model. 

2 As Sherlock Holmes says in The Copper Beeches , “I have devised seven separate 
explanations, each of which would cover the facts so far as we know them. But which 
of these is correct can only be determined by fresh information.” 


Fig. 7.3-4 Schematic illustration of the relation between the model and 
data spaces for the inverse problem d = Gm. The observed data d form a 
vector in the n-dimensional data space, the model m sought is a vector in 
the r-dimensional model space, and the known partial derivative matrix G 
has dimensions n × r. Matrix U, whose columns are the eigenvectors of the 
matrix $G G^T$, can be decomposed into $U_p$, the matrix of the p eigenvectors 
with nonzero eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$, and $U_0$, the matrix of the 
eigenvectors with zero eigenvalues. Similarly, the matrix V, whose 
columns are the eigenvectors of the matrix $G^T G$, can be decomposed into 
$V_p$, the matrix of the eigenvectors with nonzero eigenvalues, and $V_0$, the 
matrix of the eigenvectors with zero eigenvalues. (After Lanczos, 1961.) 

Thus, if a $U_0$ space exists, the model found by solving the 
inverse problem is not an exact solution. 

7.3.4 Variants of the solution 

A number of variants of the least squares solution that we have 
developed using earthquake location and tomography are also 
used in these and other inverse problems. 

One variant arises from the fact that although the eigenvector 
decomposition gives insights, it may not be the best 
approach in some real applications. First, it involves significant 
computations when the matrices are large. Second, it associates 
difficulties with the eigenvalues that are zero, whereas in real 
problems complications and noisy data are more likely to yield 
small, but nonzero, eigenvalues. These small eigenvalues cause 
the sort of difficulties that occur formally for zero eigenvalues. 
To see this, note that in Eqn 27 the model is derived by 
multiplying the data by the matrix $\Lambda^{-1}$, which contains the 
reciprocals of the eigenvalues. Thus the small eigenvalues, representing 
the worst-constrained features of the data and model 
spaces, can have large effects on the solution. For example, 
we noted in Section 4.4.7 that using the generalized inverse 
to estimate the moment tensor gives good estimates of 
components on which seismograms depend strongly, but 
poorer ones for components on which the seismogram depends 
weakly. 

This issue can be addressed in several ways. One is to exclude 
small eigenvalues from the inversion. Another, which avoids 
the eigenvector decomposition, is to modify the function used 
to measure the misfit between the data predicted by the model 
and those observed (Eqn 7.2.11) to 


$\chi^2 = \sum_i \frac{1}{\sigma_i^2} \left( \Delta d_i - \sum_j G_{ij} \Delta m_j \right)^2 + \varepsilon^2 \sum_j (\Delta m_j)^2.$ (34) 


This function is the sum of the net misfit and the change in 
length of the model vector, weighted by $\varepsilon^2$. Hence minimizing 
it is a compromise between the best fit to the data and the least 
change from the starting model. The resulting solution, written 
with the Δs suppressed, 

$m = (G^T G + \varepsilon^2 I)^{-1} G^T d,$ (35) 

is called the damped least squares solution. If ε is zero, we have 
the best-fit solution (Eqn 7.2.17), whereas larger values of ε 
reduce or damp the change in the starting model by accepting 
a poorer fit to the data. The damping parameter ε is chosen 
empirically to yield a solution that seems plausible, and thus of 
necessity reflects our ideas about the solution sought, because 
damping the poorly constrained and undesired changes in the 
model also damps the better constrained and desired changes. 
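
A sketch of the damped solution for the four-ray case, where the undamped matrix $G^T G$ is singular, shows both effects. The G matrix is an assumed geometry consistent with Eqn 24, the two damping values are arbitrary, and the solver is a simple Gaussian elimination written for this example.

```python
import math

S = math.sqrt(2.0)
# Assumed four-ray geometry (hypothetical); rows are rays, columns are blocks.
G = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, S, S, 0.0],
    [S, 0.0, 0.0, S],
]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[pivot] = M[pivot], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def damped_least_squares(G, d, eps):
    """Eqn 35: m = (G^T G + eps^2 I)^{-1} G^T d."""
    Gt = [list(col) for col in zip(*G)]
    r = len(Gt)
    A = [[sum(a * b for a, b in zip(Gt[i], Gt[j])) + (eps ** 2 if i == j else 0.0)
          for j in range(r)] for i in range(r)]
    rhs = [sum(a * b for a, b in zip(col, d)) for col in Gt]
    return solve(A, rhs)

# Noise-free data from a 1% slowness perturbation in block 3 (units of percent).
d = [sum(g * m for g, m in zip(row, [0.0, 0.0, 1.0, 0.0])) for row in G]

weakly_damped = damped_least_squares(G, d, 1e-3)
strongly_damped = damped_least_squares(G, d, 1.0)

def length(m):
    return math.sqrt(sum(x * x for x in m))

print(weakly_damped, strongly_damped)
```

With light damping the result is close to the blurred model of Fig. 7.3-3, even though the undamped system cannot be inverted. Heavier damping shrinks every component of the model, including the well-resolved ones.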

Another common situation is that we want some data to have 
greater effect on the solution, usually because we consider them 
to be better known. We thus incorporate a data-weighting 
matrix $W_d$ into the solution. The simplest is to weight by $W_d = (\sigma_d^2)^{-1}$, 
the inverse of the variance-covariance matrix of the 
data, so the data with the smallest uncertainties have the greatest 
effect. Problem 5 shows that this weighted least squares 
solution is 

$m = (G^T W_d G)^{-1} G^T W_d d.$ (36) 
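
The simplest instance of Eqn 36 is a one-parameter model in which every datum measures the same quantity, so G is a column of ones and the data errors are assumed uncorrelated. The solution then reduces to a weighted mean. The data values and variances below are hypothetical.

```python
def weighted_mean(data, variances):
    """Eqn 36 for d_i = m with uncorrelated errors: W_d = diag(1 / variance_i)."""
    weights = [1.0 / var for var in variances]
    gt_w_g = sum(weights)                                  # G^T W_d G (a scalar)
    gt_w_d = sum(w * d for w, d in zip(weights, data))     # G^T W_d d (a scalar)
    return gt_w_d / gt_w_g

data = [10.0, 12.0]           # two hypothetical measurements of one quantity
variances = [1.0, 4.0]        # the second measurement is less certain

weighted = weighted_mean(data, variances)
unweighted = weighted_mean(data, [1.0, 1.0])   # equal weights: the plain average

print(weighted, unweighted)
```

The weighted estimate lies closer to the better-determined datum than the plain average does.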

We may also want to have the model change smoothly, 
such that each element varies only slightly with respect to its 
neighbors. For instance, if the model were a continuous function 
of one variable, we measure the smoothness, or flatness, 
f, of the changes by forming 

$f = F m = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -1 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix} \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_r \end{pmatrix},$ (37) 

where F is the flatness matrix, which is a numerical approximation 
to the derivative at the edges of each element. The overall 
flatness of the solution is then 

$f^T f = m^T F^T F m = m^T W_m m,$ (38) 

so the matrix $W_m = F^T F$ is a weighting matrix for the model. For 
more complicated model geometries, F is changed appropriately. 
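
A short sketch of the flatness measure follows. It assumes F is the (r − 1) × r first-difference matrix, with −1 and 1 in adjacent columns of each row; the three test models are hypothetical.

```python
def flatness_matrix(r):
    """First-difference matrix: row k has -1 in column k and +1 in column k + 1."""
    F = [[0.0] * r for _ in range(r - 1)]
    for k in range(r - 1):
        F[k][k] = -1.0
        F[k][k + 1] = 1.0
    return F

def flatness(model):
    """f^T f, where f = F m holds the differences between neighboring elements."""
    F = flatness_matrix(len(model))
    f = [sum(a * b for a, b in zip(row, model)) for row in F]
    return sum(x * x for x in f)

constant_model = [1.0, 1.0, 1.0, 1.0]        # no variation between neighbors
ramp_model = [0.0, 1.0, 2.0, 3.0]            # steady gradient
oscillating_model = [1.0, -1.0, 1.0, -1.0]   # sign change at every element

print(flatness(constant_model), flatness(ramp_model), flatness(oscillating_model))
```

The measure is zero for a constant model and grows as neighboring elements differ more, so penalizing it favors smooth models.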

We can combine the model and data weighting in a weighted 
damped least squares inversion, which yields the solution 

$m = (G^T W_d G + \varepsilon^2 W_m)^{-1} G^T W_d d.$ (39) 

As noted earlier, the damping parameter ε is chosen empirically. 
If we do not weight the data and model, the weighting 
matrices $W_m$ and $W_d$ are identity matrices, and Eqn 39 is just 
the simple damped least squares solution (Eqn 35). 

An example of such an inversion was shown for P-wave 
velocities at the base of the mantle in Fig. 3.5-17. A grid of 660 
nodes that were roughly equally spaced was used to represent 
the base of the mantle. The damping factor, ε = 1.2, was a compromise 
between the best fit, which minimizes the prediction 
error, and minimizing the undetermined part of the solution. 
Because each node is surrounded by 5 or 6 nodes that are 
roughly equidistant, the rows of the model flatness matrix F 
were chosen with the diagonal term equal to -1 and the terms 
of the nearest N neighbors equal to 1/N (with N = 5 or 6). The 
data were weighted empirically so that the diagonal elements 
of the $W_d$ matrix ranked the quality of the observations from 9 
(excellent) through 4 (good) to 1 (poor). These choices again 
bear out that we have various ways of solving inverse problems, 
so the solution we develop depends on choices about 
the data we use and the model we seek, based on our ideas 
about what seems reasonable. Hence our solutions are in part 
objective and in part subjective, and different approaches yield 
different solutions. 


7.3.5 Examples 

Studies using travel time tomography yield interesting results 
for various areas. For example, Fig. 7.3-5 (top) shows the model 
geometry used in a study of the upper mantle in the region 
including Central Europe, the Mediterranean, and the Middle 
East. The model contains nine layers, each divided into 1040 1° 
by 1° blocks. The layer thickness increases with depth from 
33 km at the top to 130 km at a depth of 670 km. The data consist 
of approximately half a million travel times from about 
25,000 earthquakes, recorded at stations both within the 
model region and at distances to 90°. 

The data used are travel time anomalies relative to the 
Jeffreys-Bullen values, which can result from earthquake 
mislocations as well as variations in seismic velocities. The 
location and origin time of the earthquakes were thus also 
inverted for, so the number of unknowns reflects both the 
number of blocks (9360) and four times the number of earthquakes 
used. To reduce these numbers, procedures were used 
to combine data from nearby earthquakes and from stations 
close to each other. The problem to be solved thus involves 
approximately 300,000 equations for 20,000 unknowns. 

Fig. 7.3-5 Top: Block model for a travel time tomographic study of the 
upper mantle in the region including Central Europe, the Mediterranean, 
and the Middle East. The heavy line indicates the location of the 
cross-section shown below. (Spakman and Nolet, 1988, with kind permission 
from Kluwer Academic Publishers.) Bottom: Cross-section through the 
block model across the Hellenic trench region, showing P-wave velocity 
perturbations with respect to the JB model. (Spakman et al., 1988. 
Geophys. Res. Lett., 15, 60-3, copyright by the American Geophysical 
Union.) 

Solving matrix equations of this size poses major difficulties. 
The matrices are so large (in this case $6 \times 10^9$ elements) that 
they are difficult to store in a computer and operate on. As a 
result, numerical methods are used, some of which allow only 
a single row of the matrix to be manipulated at any time. The 
properties of these algorithms and methods of improving the 
resulting image form an active research area. 
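
To give a sense of how a row-at-a-time method works, the sketch below uses the classical Kaczmarz iteration. The text does not say which algorithm was used in this study, so this is purely illustrative; the four-ray G is the assumed toy geometry consistent with Eqn 24, not the real problem.

```python
import math

# Size of the full matrix quoted in the text: equations times unknowns.
n_elements = 300_000 * 20_000

def kaczmarz(G, d, sweeps=200):
    """Row-action iteration: correct the model using one equation at a time."""
    m = [0.0] * len(G[0])
    for _ in range(sweeps):
        for row, d_i in zip(G, d):
            norm_sq = sum(g * g for g in row)
            if norm_sq == 0.0:
                continue                      # a ray that samples no block
            misfit = d_i - sum(g * x for g, x in zip(row, m))
            step = misfit / norm_sq
            # Project the current model onto the plane defined by this equation.
            m = [x + step * g for x, g in zip(m, row)]
    return m

S = math.sqrt(2.0)
# Assumed four-ray geometry (hypothetical); only one row is used per update.
G = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, S, S, 0.0],
    [S, 0.0, 0.0, S],
]
d = [1.0, 0.0, S, 0.0]       # data from a 1% slowness perturbation in block 3

model = kaczmarz(G, d)
print(n_elements, model)
```

For this consistent toy system, starting from a zero model, the iteration settles on the same blurred model as the generalized inverse, without ever forming or storing $G^T G$.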

The resulting three-dimensional velocity model can be shown 
as either cross-sections or map views at various depths. Figure 
7.3-5 (bottom) shows a cross-section across the Hellenic 
trench region, where the African plate subducts beneath Crete 
and the Aegean basin (Fig. 5.6-8). The tomographic image 
shows velocity anomalies in percent of the velocity predicted 
for that depth by the JB model. A planar high-velocity (positive) 
anomaly, presumably the cold downgoing slab, dips NW 
from the trench and extends to depths well below the deepest 
earthquakes (dots). Above the slab, a low-velocity (negative) 

Fig. 7.3-6 Analysis of the tomographic image in Fig. 7.3-5 (bottom). Top: 
Hit count plot, showing the number of times each block is sampled. Black 
regions indicate the best-sampled blocks (hit counts in excess of 2000). 
Bottom : Resolution test using synthetic velocity anomalies. Travel times 
are generated for a model with 5% velocity perturbations, of alternating 
sign, in each of the blocks marked by heavy lines. How well the 
perturbations are recovered illustrates how much the image is blurred. 
(Spakman and Nolet, 1988, with kind permission from Kluwer Academic 
Publishers.) 

region occurs, presumably due to flow behind the arc. Such 
observations are valuable for modeling the subduction history 
and dynamics. 

Because tomographic images are solutions to an inverse 
problem, they are neither unique nor exact. Hence it is important 
to assess which features in the image are likely to be geologically 
real, and which are more likely to be artifacts of the 
inversion. As we have seen, an important factor is how well 
parts of the model are sampled by the ray paths. Figure 7.3-6 
(top) shows a hit count plot for the section of Fig. 7.3-5 
(bottom), showing the number of ray paths that sample each 
block. The better-illuminated regions should be better resolved 
than poorly sampled regions. Additional insight comes from 
analyzing how a perturbation in one model block is blurred by 
the inversion into nearby blocks. This information, given by 
the resolution matrix (Eqn 23), can also be found by placing a 
perturbation in one block, computing the forward problem, 
and inverting the result. Because this would be time consuming 
for such a large model, perturbations were placed in various 
blocks, and the combined resolution was estimated by computing 
synthetic travel time data and inverting it. Figure 7.3-6 

Fig. 7.3-7 Illustration of the effects of the 
reference model in travel time tomography. 
Velocity structure (a) and ray paths (b) for 
global reference models JB and PREM 
and a local reference model VCAR. The 
differences (c, d) between the tomographic 
images reflect differences between the 
reference models near 600 km depth. 
(van der Hilst and Spakman, 1989. Geophys. 
Res. Lett., 16, 1093-6, copyright by the 
American Geophysical Union.) 


(bottom) illustrates this method for a 5% velocity contrast 
whose sign alternates between columns. If resolution were perfect, 
the image would be reconstructed exactly: each anomaly 
would be confined to the original block (heavy line). Due to the 
ray geometry, the anomalies “blur”, but are still concentrated 
in the correct locations. Comparison with the hit counts shows 
that better-sampled regions, such as the second column from 
the left, are better resolved than poorly sampled regions like 
the lower left column. The reconstructed image is further degraded 
when the effects of noise in the data are simulated. Even 
in this case, the inversion results locate the perturbed blocks 
reasonably well and retrieve the sign of the perturbation. 
These tests suggest strongly that the high-velocity slab in the 
image is real. 

Typically, the major features of tomographic inversions seem 
likely to be real, but assessing how much of the detailed structure 
is real is more difficult. For example, Fig. 5.4-7 showed 
the results of a numerical experiment to see how well a tomographic 
study would reconstruct the image of a theoretical 
subducting slab. It turned out that the general shape of the slab 
was resolved, but was blurred by artifacts implying velocity 
anomalies that are not present in the original model. In this 
case these artifacts, generally of low amplitude, caused the slab 
to appear to broaden, shallow in dip, or flatten out. The extent 
to which these artifacts appear depends on ray geometry, so 
the image could be improved by using upgoing as well as 
downgoing rays. 

Another important factor in tomographic images is the reference 
model with respect to which the velocity anomalies are 
shown. In examining images, it is natural to focus on the lateral 
variations. However, because these variations are with respect 
to a starting model, which is usually laterally homogeneous, 
the resulting images depend on the starting model. Figure 7.3-7 
shows an example for the Lesser Antilles. The ray paths predicted 
by the global JB and PREM reference models differ 
somewhat from those predicted by a model VCAR developed 
for this region. As a result, tomographic images relative to the 
JB and VCAR models differ. Although both show the high- 
velocity North American plate subducting westward beneath 
the Caribbean, the JB image implies that the slab flattens at the 
660 km discontinuity, whereas this suggestion is much weaker in 
the VCAR image. The flattening in the JB image results from 
the fact that the inversion yields “streaks” of velocities relative 
to JB that are lower than those observed above 660 km, and 
higher than those observed below 660 km. This effect arises 
because, compared to VCAR, the JB model predicts higher 
velocity above 660 km, and lower velocity below. Thus a bias 
in the reference model can produce spurious lateral heterogeneity. 
Similar reference model artifacts, in which a common state 
seems abnormal due to the standard used, appear in various 
inverse problems and other situations. 3 However, the choice 
of reference model is subjective, so making a choice requires 
recognizing its consequences. For example, a global velocity 
reference model that excluded subducting slabs would be 
slower than the actual global average, whereas one including 
slabs would predict slow anomalies elsewhere. 

Fig. 7.3-8 An example of cross-borehole tomography in Manitoba, Canada. Left: Travel times are recorded from a source at different depths in one 
borehole to receivers in the other. The experiment is then reversed, yielding dense ray path coverage. Center: Straight ray paths computed for the laterally 
homogeneous starting model. Right: Ray paths for the laterally varying model found from the inversion. (Wong et al., 1987.) 


In addition to ray geometry and reference model artifacts, it 
is worth noting that tomographic images can also be affected 
by something as simple as the contouring scheme used. Sometimes 
when features are not robust aspects of the image, their 
tectonic interpretation depends in part on preconceptions, 
much like the ink-blot tests used by psychologists. Thus, despite 
the power and value of tomographic images, it is important to 
bear their limitations in mind. 

Tomography is also used in other seismological applications. 
One important use, providing detailed near-surface images, 
is illustrated by Fig. 7.3-8 showing tomography between two 
boreholes. The source and receivers were moved to generate 
dense coverage with many crossing ray paths. The travel time 
observations were then inverted for velocity structure. In this 
experiment, the ray paths were recomputed for the perturbed 
model and used to compute travel times for later iterations. 
The differences between the initial and perturbed ray paths 
show the advantages of recomputing the ray paths for each 
successive model, a process called nonlinear tomography. This 
updating ensures that the ray paths, and hence predicted travel 
time anomalies, are consistent with the velocity structure being 
found. However, for practical reasons it is common to conduct
linearized tomography using ray paths from the starting model 
even as the model is perturbed, and to assume that the resulting 
errors are small. 

It is interesting to compare travel time tomography to the 
surface wave tomography discussed in Section 2.8.3, where the 
average surface wave velocity along multiple paths through 
oceanic lithosphere of various ages is used to infer the velocity 
structure for each age range. The approach is to find the phase 
or group wave velocity as a function of frequency for each
age range, and then infer the variation in the medium velocity
with depth from the dispersion curve giving the variation in 
apparent velocity as a function of frequency. Hence this is 
tomography in the lateral direction, and dispersion analysis 
vertically. We will see in the next section that dispersion ana¬ 
lysis is an example of methods that infer earth structure using 
functions that sample the structure at depth in different ways. 

Tomographic methods can be used for waveforms as well as 
travel times. As noted earlier — for example, in Fig. 3.7-7 — 
waveforms sample earth structure over broader regions than 
travel times, which, in the limit, correspond to sampling along 
narrow geometric rays. Figure 7.3-9 shows some results 
from global tomography in which velocity perturbations were 
inferred by fitting both waveforms from 27,000 long-period 
seismograms and 14,000 travel times. The seismograms include 
body wave records (from the P or PKP arrival to the start of the 
surface waves) and “mantle wave” records, which are low-pass 
filtered seismograms about 4.5 hours in length. The travel time 
data include both absolute shear wave arrival times and differential
(SS-S and ScS-S) times. Rather than inverting for the
velocity perturbations in blocks, the velocity perturbation was 
described by a series of orthogonal functions, and the inversion 
was for the coefficients of the functions. The lateral structure 
was described by spherical harmonics (Section 2.9.3), and the 
vertical structure was modeled using Chebyshev polynomials. 

In addition, we saw in Section 3.7 that amplitude tomography
can infer attenuation variations along the ray paths.
Amplitude tomography is similar to medical tomography, 4
in which the image indicates the degree to which X-rays are
absorbed in different portions of the subject. Medical tomography
has the advantages that the subject can be uniformly
illuminated from all sides, and that the internal structure is
both well understood and subject to later, direct observation.

4 The medical term “CAT scan” stands for computed axial tomography.

Fig. 7.3-9 Tomographic image of shear wave velocities along a great circle
slice through the Equator, obtained by inversion of both waveforms and
travel times. (Su et al., 1994. J. Geophys. Res., 99, 6945-80, copyright by
the American Geophysical Union.)


7.4 Stratified earth structure 

Quantities that can be determined using seismological data are 
often the integrals of a physical property of the earth. For 
example, the travel time is the integral of slowness along a ray 
path. As discussed in the last section, although a single travel 
time gives only the average slowness along the ray path, travel 
times for different ray paths can be combined to find the spatial 
distribution of slowness. 

A common such problem is finding earth structure for later¬ 
ally homogeneous or stratified earth models, in which physical 
properties are assumed to vary only with depth. Frequently, an
observable quantity $d_i$ can be expressed as the integral over the
radius of a physical property $m(r)$,

$$d_i = \int_0^a G_i(r)\, m(r)\, dr, \qquad (1)$$



Fig. 7.4-1 Schematic amplitude spectrum of a seismogram, showing the
observations used to invert normal mode data for earth structure. Each
mode peak is described by a width proportional to $Q_i^{-1}$, which describes
its attenuation, and an eigenfrequency $\omega_i$.


where $G_i(r)$ is a known function of depth called a kernel. Given
a set of $d_i$ with different kernels, each of which samples the
distribution of m(r) differently, the inverse problem is to infer 
m(r). Although the relation between the observed quantity and 
earth structure is sometimes less intuitive than for travel time 
and slowness, the problems can be formulated in a similar way. 
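
As a minimal numerical sketch of this integral relation between a datum and the model, the integral can be approximated by sampling the kernel and the model on a grid in radius. The two kernels and the model below are hypothetical, chosen only so that the integrals are easy to check by hand, and the function names are ours rather than part of any standard package.

```python
def trapezoid(values, step):
    """Integrate equally spaced samples with the trapezoid rule."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def predict_datum(kernel, model, radius, n=2000):
    """Approximate d = integral from 0 to radius of kernel(r) * model(r) dr."""
    step = radius / n
    samples = [kernel(k * step) * model(k * step) for k in range(n + 1)]
    return trapezoid(samples, step)


A = 1.0  # normalized radius (hypothetical units)


# Hypothetical kernels: each integrates to 1 over [0, A], but they
# weight shallow and deep structure differently.
def shallow_kernel(r):
    return 2.0 * r / A**2          # largest near the surface, r = A


def deep_kernel(r):
    return 2.0 * (A - r) / A**2    # largest near the center, r = 0


# Hypothetical model: the property increases linearly toward the surface.
def model(r):
    return 1.0 + r / A


d_shallow = predict_datum(shallow_kernel, model, A)
d_deep = predict_datum(deep_kernel, model, A)
print(d_shallow, d_deep)
```

Although both data are averages of the same model, the datum with the shallow-weighted kernel is larger, because that kernel samples the part of the model with higher values. This difference in sampling is what lets a set of data constrain the variation of the model with depth.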

We encountered this idea in discussing Love wave dispersion 
in Section 2.7.4. The apparent phase velocity along the free 
surface varies as a function of period, because waves of different
period sample the velocity at depth differently. Hence this
variation can be used to study the velocity at depth. 

7.4.1 Earth structure from normal modes

The concepts of inverting observations for the structure of 
a stratified medium can be illustrated using normal modes 
(Sections 2.9 and 3.7). The displacement field of the $i$th mode
excited by an earthquake can be written

$$u_i(t) = C_i(t) \exp(-\omega_i t / 2Q_i). \qquad (2)$$

The mode’s eigenfrequency $\omega_i$ and quality factor $Q_i$, which
describes the attenuation, and thus the width of the peak,
can be found from the Fourier transform of the seismogram
(Fig. 7.4-1). Because $\omega_i$ and $Q_i$ depend on the variation with
depth of the seismic velocities, density, and attenuation, these
observations can be used to study earth structure.

To do this, we begin with an earth model described by $\alpha(r)$,
$\beta(r)$, and $\rho(r)$ and find the eigenfrequencies of the different
modes, $\omega_i$. This calculation also gives the partial derivative
functions

$$\frac{\partial \omega_i}{\partial \alpha}(r), \quad \frac{\partial \omega_i}{\partial \beta}(r), \quad \frac{\partial \omega_i}{\partial \rho}(r), \qquad (3)$$


showing how a mode’s eigenfrequency changes if the velocity 
or density at a given depth is perturbed. The total change in the 
eigenfrequency is the integral over the radius of the perturba¬ 
tions in the earth model: 

$$\Delta\omega_i = \int_0^a \left[ \frac{\partial \omega_i}{\partial \alpha}(r)\, \Delta\alpha(r) + \frac{\partial \omega_i}{\partial \beta}(r)\, \Delta\beta(r) + \frac{\partial \omega_i}{\partial \rho}(r)\, \Delta\rho(r) \right] dr. \qquad (4)$$

Fig. 7.4-2 Observed attenuation for fundamental spheroidal modes
$_0S_2$ - $_0S_{191}$. The variation in $Q^{-1}$ with period reflects the depth variation of
$q^{-1}(r)$. (Stein et al., 1981. Anelasticity in the Earth, 39-53, copyright by
the American Geophysical Union.)

Thus the difference between a measured eigenfrequency and 
that predicted by an earth model can be inverted to find the 
perturbation in the model required to fit the data. Although a 
single mode observation gives only the average over depth of 
the required perturbation, a set of modes gives more informa¬ 
tion, because the partial derivatives vary between modes. 

We illustrate the method using the corresponding inverse
problem for attenuation, which has a simple linear form. If
attenuation within the earth is described by the function $q(r)$,
the quality factor for the $i$th mode is

$$Q_i^{-1} = \int_0^a G_i(r)\, q^{-1}(r)\, dr, \qquad (5)$$

where the kernels $G_i(r)$ are derived from the partial derivatives
(Eqn 4), using the formulation of the quality factor as an imaginary
part of the frequency that is related to an imaginary part of
the velocity (Section 3.7.6). Although the symbol $Q$ is commonly
used for both the modes’ quality factor and the attenuation
as a function of depth, using $q(r)$ for the latter emphasizes
the distinction. The problem is written using the reciprocals
$q^{-1}(r)$ and $Q_i^{-1}$, so higher attenuation (larger loss of seismic
energy) corresponds to larger values.

Figure 7.4-2 shows measured values of the attenuation of 
fundamental spheroidal modes, which for periods less than a 
few hundred seconds correspond to fundamental mode Rayleigh 
waves. The attenuation is low for the longest-period modes, 
rises to its highest values at periods slightly above 100 seconds, 
and then decreases again for the shortest periods (about 50 sec¬ 
onds) shown. This variation occurs because the kernels differ 
between modes (Fig. 7.4-3). Because $Q^{-1}$ for a mode is the
integral of the attenuation weighted by the kernel, the shape of 
the kernel with depth illustrates a mode’s sensitivity to attenu¬ 
ation at various depths. Long-period modes are most sensitive 
in the lower mantle, periods near 100 seconds sample the low- 
velocity zone heavily, and periods near 50 seconds are most 
sensitive to structure in the “lid” region above the low-velocity
zone. $Q^{-1}$ is a smooth function of the period, because the
kernels of fundamental modes with similar periods are similar.

Fig. 7.4-3 Attenuation kernels for various
modes, illustrating the different depth
sampling. Attenuation values are for the
third model in Fig. 7.4-5. (Stein et al., 1981.
Anelasticity in the Earth, 39-53, copyright
by the American Geophysical Union.)

Fig. 7.4-4 Schematic illustration of the model parameterizations for two
types of inversion methods. In parameter space inversions, the model is
divided into layers; in data space inversions the model is treated as a
weighted sum of the kernels.

The inverse problem is to use the observed mode attenuation 
$Q_i^{-1}$ and the known kernels $G_i(r)$ to infer the function $q^{-1}(r)$
describing the variation of attenuation with depth in the earth 
that best fits the data. This problem can be approached in 
several ways, two of which we discuss briefly. 

7.4.2 Parameter and data space inversions 

The most direct approach, parameter space inversion, is to
regard the unknown model $q^{-1}(r)$ as constant in a set of layers
(Fig. 7.4-4, left), such that in the $j$th layer

$$q^{-1}(r) = q_j^{-1}, \quad r_j < r < r_{j+1}. \qquad (6)$$

The inverse problem is then converted from an integral to a
matrix equation

$$Q_i^{-1} = \int_0^a G_i(r)\, q^{-1}(r)\, dr = \sum_j A_{ij}\, q_j^{-1}, \qquad (7)$$

where the matrix elements are

$$A_{ij} = \int_{r_j}^{r_{j+1}} G_i(r)\, dr. \qquad (8)$$

The observations are inverted for the value of the parameter
$q_j^{-1}$ in each layer.

By choosing a smaller number of layers than mode observations,
we obtain an overdetermined system of equations. As
before, the generalized inverse gives the “best” solution in a
least squares sense. The concepts developed previously are useful
for assessing the solution. Columns of the resolution
matrix, called resolving kernels, indicate how well the value in
the corresponding layer could be determined independently of
those in the other layers if the data had no errors. This uncertainty
results from the inverse problem itself, and reflects the
best resolution possible, given the available kernels, analogous
to the resolution matrix (Eqn 7.3.23) in the tomographic example.
It is also useful to consider the model covariance matrix,
which indicates the uncertainty in the model due to both the
nature of the inverse problem and the errors in the observations.
Often a weighted average over a number of layers is the
best resolution obtainable, analogous to the blurring in travel
time tomography.

Parameter space inversion has a few unattractive features.
First, the layers in which attenuation is treated as constant must
be chosen in advance. This choice might not be a meaningful
one. Second, parametrizing the model as constant in these
layers yields a model with “steps” at layer boundaries. These
steps may be quite unphysical; in many cases our intuition
(admittedly sometimes a poor guide) suggests that physical
properties should vary smoothly with depth.

In an alternative formulation, data space inversion, the
unknown model describing attenuation as a function of depth
is expanded not into constant layers, but as a weighted sum of
the kernels themselves (Fig. 7.4-4, right),

$$q^{-1}(r) = \sum_j \nu_j\, G_j(r). \qquad (9)$$

The inverse problem is then

$$Q_i^{-1} = \int_0^a G_i(r) \sum_j \nu_j\, G_j(r)\, dr = \sum_j A_{ij}\, \nu_j, \qquad (10)$$

where the matrix elements are

$$A_{ij} = \int_0^a G_i(r)\, G_j(r)\, dr. \qquad (11)$$

The model is found by inverting for the expansion coefficients
$\nu_j$.


Data space inversion is less intuitive than parameter space 
inversion, but has the attractive features that the resulting 
model is a smooth function of depth, and need not be para¬ 
metrized in depth in advance. Moreover, it is in some sense 
“natural” to use the kernels as basis functions for the model, 
because the observations sample the model along these kernels. 
However, these solutions often seem too smooth for our instincts,
just as the parameter space solutions often seem too
jagged. We often both expect changes in properties near certain
depths and are reluctant to force them into the solution. This
dilemma is an example of the general issue of deciding how
much we want the inversion solution to reflect our preconceptions,
some of which may be correct, especially when derived
from other data, and some of which may be incorrect. We can
choose to focus on what the data require, what the data permit,
or a combination of the two.

Fig. 7.4-5 Comparison of various attenuation models. Despite the differences, all reproduce the general features of the data in Fig. 7.4-2, as shown in the
right hand panels. (Stein et al., 1981. Anelasticity in the Earth, 39-53, copyright by the American Geophysical Union.)
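
The contrast between the two formulations can be sketched numerically. The example below is only an illustration under stated assumptions: the four polynomial kernels, the synthetic attenuation model $q^{-1}(r) = 1 + r$ on a unit radius, and all function and variable names are hypothetical, chosen so that the integrals have simple closed forms. The parameter space solution uses two layers and ordinary least squares via the normal equations; the data space solution expands the model in the kernels themselves.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


N_MODES = 4


def kernel(i, r):
    """Hypothetical kernel G_i(r) = (i + 1) r**i on 0 <= r <= 1."""
    return (i + 1) * r**i


# Synthetic "observations" for the hypothetical model q^-1(r) = 1 + r:
# the integral of G_i(r) (1 + r) dr over [0, 1] is 1 + (i + 1)/(i + 2).
data = [1.0 + (i + 1) / (i + 2) for i in range(N_MODES)]

# Parameter space: two constant layers, 0 < r < 0.5 and 0.5 < r < 1.
# A_ij is the integral of G_i over layer j, here r**(i + 1) at the edges.
edges = [0.0, 0.5, 1.0]
A_par = [[edges[j + 1] ** (i + 1) - edges[j] ** (i + 1) for j in range(2)]
         for i in range(N_MODES)]
# Overdetermined (4 data, 2 unknowns): solve the normal equations.
AtA = [[sum(A_par[i][j] * A_par[i][k] for i in range(N_MODES))
        for k in range(2)] for j in range(2)]
Atd = [sum(A_par[i][j] * data[i] for i in range(N_MODES)) for j in range(2)]
q_layers = solve(AtA, Atd)

# Data space: A_ij is the integral of G_i G_j, (i+1)(j+1)/(i+j+1) here.
A_dat = [[(i + 1) * (j + 1) / (i + j + 1) for j in range(N_MODES)]
         for i in range(N_MODES)]
nu = solve(A_dat, data)


def q_inv_smooth(r):
    """Data space model: weighted sum of the kernels."""
    return sum(nu[j] * kernel(j, r) for j in range(N_MODES))


print("layer values:", q_layers)
print("expansion coefficients:", nu)
print("smooth model at r = 0.5:", q_inv_smooth(0.5))
```

In this contrived case the data space solution recovers the synthetic model, because that model happens to be a combination of the kernels, whereas the layered solution gives two constant values (roughly 1.16 and 1.84) with a step between them. With real kernels and noisy data neither outcome is guaranteed.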

These issues are illustrated in Fig. 7.4-5, which shows several 
models for attenuation as a function of depth, all generally con¬ 
sistent with the data in Fig. 7.4-2. Model SL8 was derived by 
parameter space inversion, whereas the others were derived 
from data space inversion. The lower two models were derived 
by inverting the data in Fig. 7.4-2 with different misfit func¬ 
tions, whereas the upper two were derived from different data. 
Although the models differ, all have low attenuation in the 
lower mantle, high attenuation in the upper mantle associated 
with the low-velocity zone, and moderate attenuation in the 
“lid” above the low-velocity zone. The models illustrate the 
range of acceptable solutions. For example, the high attenuation
zone at the base of the mantle in model SL8 is permissible,
and thus survives if included in the starting model, but is not 
required by the data. This ambiguity results from the fact that 
the data have little resolution for structure at this depth, as 
shown by the kernels in Fig. 7.4-3. 

7.4.3 Features of the solutions 

The inverse problem for attenuation (Eqn 5) has a simple form,
because each mode’s quality factor depends linearly on $q^{-1}(r)$,
so the observations can be inverted directly for the attenuation
structure. If this is not the case, we linearize about a starting
model (Section 7.2.1), so the change in a datum depends linearly
on the change in model parameters

$$\Delta d_i = \int_0^a G_i(r)\, \Delta m(r)\, dr. \qquad (12)$$
Fig. 7.4-6 Inversion of Rayleigh wave phase
and group velocity measurements for shear
wave velocity structure beneath the Pacific.
(a): Phase and group velocity partial
derivatives at 40 and 100 s periods. (b):
Starting (dotted line) model and final model
derived by parameter space inversion.
Horizontal lines indicate the model standard
deviation in each layer. (c): Resolving kernels
for various depths. The number and
horizontal line indicate the depth for each
kernel. (Yu and Mitchell, 1979.)


Figure 7.4-6 illustrates a parameter space inversion for vertical
shear velocity structure from Rayleigh waves. Using the partial
derivatives

$$\frac{\partial c}{\partial \beta}, \quad \frac{\partial U}{\partial \beta}, \qquad (13)$$

which show how the phase velocity $c$ and group velocity $U$ at a
particular period change in response to a shear velocity perturbation at
each depth, the starting model is modified to fit the observed
dispersion. The resolving kernels that illustrate the vertical
“smearing” are largest at the depth for which they are computed,
but have nonzero amplitudes at other depths. The best
resolution occurs when the kernel is sharply peaked at the
desired depth.

As we noted earlier, the generalized inverse solution yields 
the minimum change in the model that best produces a desired 
change in the data. Hence the final model is as close to the start¬ 
ing model as possible. Features of a model derived by linearized 
inversion can thus depend on the starting model. For example, 
in a parameter space inversion, a layer whose value in the starting
model is assumed to differ significantly from adjacent layers
will often retain this feature in the solution. One way to avoid
this is to start off with a model whose properties are uniform
with depth. In other cases, data not included in the inversion
can be used to find a starting model more appropriate than
a uniform one. Another approach is to do inversions with different
starting models and compare the resulting solutions. If
the solutions differ, they are likely local minima of the misfit 
function (Eqn 7.2.11) that the inversion minimized, whereas 
if the different starting models yield the same solution, it is 
more likely to be the global minimum that we seek. Yet another 
approach is to search numerically for the minimum in the 
model space by varying the model parameters. Such “brute 
force” approaches, in which we solve the inverse problem by 
solving the forward problem many times, are attractive when 
the number of model parameters is small, because they avoid 
the issue of linearizing about the starting model and show the 
trade-offs between various parameters. For example, Fig. 5.3-8 
showed the trade-off between plate thickness and basal tem¬ 
perature in inverting oceanic depth and heat flow data for 
thermal structure. 
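
A “brute force” search of this kind can be sketched in a few lines. The forward problem below, a two-layer attenuation model whose predicted data are weighted averages of the deep and shallow values, is hypothetical, as are the weights, the grid limits, and the names; the point is only that the inverse problem is solved by evaluating the forward problem at every grid point and keeping the misfit for each.

```python
# Hypothetical weights: the fraction of each datum's kernel that lies
# in the deep layer (the remainder lies in the shallow layer).
weights = [0.5, 0.25, 0.125, 0.0625]


def forward(q_deep, q_shallow):
    """Predict the data for a two-layer model."""
    return [w * q_deep + (1.0 - w) * q_shallow for w in weights]


# Synthetic observations from an assumed "true" model.
observed = forward(1.2, 1.8)


def misfit(q_deep, q_shallow):
    """Sum of squared differences between observed and predicted data."""
    predicted = forward(q_deep, q_shallow)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Evaluate the forward problem on a grid of model parameters.
grid = [1.0 + 0.05 * k for k in range(21)]
surface = {(qd, qs): misfit(qd, qs) for qd in grid for qs in grid}

(best_deep, best_shallow), best_misfit = min(
    surface.items(), key=lambda item: item[1])
print(best_deep, best_shallow, best_misfit)
```

Because the misfit is stored for every grid point, contouring `surface` would show the trade-off between the two parameters directly. The cost grows rapidly with the number of parameters, which is why such searches are practical only for small models.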

Parameter space and data space inversions can be carried out 
using more sophisticated variations. For example, parameter 
space inversion can be smoothed to reduce the jumps at layer 
boundaries. Data space inversion can be formulated in terms 
of a set of orthogonal kernels, rather than the actual kernels, 
which are often quite similar to each other. This approach 
expands the model in the simplest possible way with the 
minimum number of parameters. In addition, the model can be 
constrained to fit the data only within the error bars, rather 
than attempt to fit the mean value of each datum. 

Due to the structure of inverse problems and the range of 
possible techniques available, various solutions can generally 
be derived for a set of seismological observations. As a result, 
inverse problems remain an important research area. The 
choices, ambiguities, and trade-offs in the solutions of these 
problems are sometimes key features of the solution. Attempts 
to explain these issues can be frustrating to nonseismologists, 
as illustrated by the joke that in response to the question 
“How much is 2 + 2,” an engineer replies “3.9999,” a geologist 
replies, “Somewhere in the mid-single digits,” and a geo¬ 
physicist replies, “How much do you want it to be?” 

7.5 Inverting for plate motions 

We end our discussion of inverse problems with the issue of 
determining the Euler vectors that describe relative plate 
motions. As we have noted, these Euler vectors are derived in 
part from earthquake focal mechanisms, and are then used as 
a reference model to predict the directions and rates of plate 
motions for applications including estimating earthquake re¬ 
currence, slip partitioning, and the fractions of seismic and 
aseismic slip at plate boundaries. 

7.5.1 Method

The forward problem (Section 5.2.1) is that at any point $\mathbf{r}$
along their boundary, the linear velocity of plate $j$ with respect
to plate $i$ is

$$\mathbf{v}_{ji} = \boldsymbol{\omega}_{ji} \times \mathbf{r}, \qquad (1)$$

where $\boldsymbol{\omega}_{ji}$ is the relative angular velocity, or Euler vector. Hence
the rate and direction of plate motion are given by the north-south
and east-west components of $\mathbf{v}$,

$$\text{rate} = |\mathbf{v}| = \sqrt{(v^{NS})^2 + (v^{EW})^2},$$

$$\text{azimuth} = 90^\circ - \tan^{-1}[(v^{NS}) / (v^{EW})]. \qquad (2)$$
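
The forward problem can be sketched as follows. This is an illustrative implementation under our own assumptions, not the code used for any published plate model: the earth is treated as a sphere of radius 6371 km, the function names are hypothetical, and `atan2` is used in place of the inverse tangent so that the azimuth falls in the correct quadrant.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical earth, an assumption of this sketch


def unit_vector(lat_deg, lon_deg):
    """Cartesian unit vector for a geographic latitude and longitude."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plate_motion(pole_lat, pole_lon, rate_deg_per_myr, site_lat, site_lon):
    """Return (rate in mm/yr, azimuth in degrees clockwise from north)
    predicted at a site by an Euler vector."""
    omega = [math.radians(rate_deg_per_myr) * c
             for c in unit_vector(pole_lat, pole_lon)]
    r = [EARTH_RADIUS_KM * c for c in unit_vector(site_lat, site_lon)]
    v = cross(omega, r)  # km/Myr, numerically equal to mm/yr

    # Local north and east unit vectors at the site.
    lat, lon = math.radians(site_lat), math.radians(site_lon)
    north = (-math.sin(lat) * math.cos(lon),
             -math.sin(lat) * math.sin(lon),
             math.cos(lat))
    east = (-math.sin(lon), math.cos(lon), 0.0)
    v_ns = sum(a * b for a, b in zip(v, north))
    v_ew = sum(a * b for a, b in zip(v, east))

    rate = math.hypot(v_ns, v_ew)
    azimuth = (90.0 - math.degrees(math.atan2(v_ns, v_ew))) % 360.0
    return rate, azimuth


# Check with a pole at the geographic north pole and a site on the
# equator: the motion should be due east at (rotation rate) x (radius).
rate, azimuth = plate_motion(90.0, 0.0, 1.0, 0.0, 0.0)
print(rate, azimuth)
```

In an inversion, a function like this would be evaluated at every data site to form the predicted rates and azimuths that are compared with the observations.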

The corresponding inverse problem is to find a model, or 
set of Euler vectors, that best predicts the observed motions. 
Because Euler vectors can be added, assuming that the plates 
are rigid, m plates are specified by m - 1 Euler vectors, and thus 
their 3(m - 1) components. Hence we use a data vector d com¬ 
posed of rates and azimuths to estimate the model vector m 
composed of the Euler vector components. Both the model 
and data vectors consist of physically different quantities: the 
model vector is made up of Euler pole latitudes, longitudes, and 
rotation rates 

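$$\mathbf{m} = (\theta_1, \theta_2, \ldots \theta_{m-1}, \phi_1, \phi_2, \ldots \phi_{m-1}, |\boldsymbol{\omega}_1|, |\boldsymbol{\omega}_2|, \ldots |\boldsymbol{\omega}_{m-1}|), \qquad (3)$$

whereas the data vector contains rates and azimuths

$$\mathbf{d} = (r_1, r_2, \ldots r_k, az_1, az_2, \ldots az_{n-k}). \qquad (4)$$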

As written, the inverse problem is not linear because the data
are complicated functions of the model parameters. Thus, as in
the previous examples, we linearize about a starting model by
forming the partial derivative matrix

$$G_{ij} = \frac{\partial d_i}{\partial m_j}, \qquad (5)$$

showing how a change in the $j$th model parameter affects the
prediction of the $i$th datum. The derivatives are found by differentiating
the expressions for $v^{NS}$ and $v^{EW}$ (Eqn 5.2.7). We then
have the usual equation

$$\Delta\mathbf{d} = G\, \Delta\mathbf{m}, \quad \text{or} \quad \Delta d_i = \sum_j G_{ij}\, \Delta m_j, \qquad (6)$$

relating the changes in the data and the model. The system
is usually overdetermined, because we generally have data at 
many sites and solve for only a few plate model parameters. For 
example, the NUVEL-1 model has 12 plates whose motions 
were estimated from 1122 data (Fig. 1.1-9). We thus use the 
weighted least squares solution 

$$\Delta\mathbf{m} = (G^T W_d G)^{-1} G^T W_d\, \Delta\mathbf{d}, \qquad (7)$$

where the weight matrix $W_d = (\sigma_d^2)^{-1}$, the inverse of the
variance-covariance matrix of the data, contains our estimates of
the uncertainty in rates from magnetic anomalies and the
uncertainties in directions associated with estimating transform
azimuths and determining earthquake slip vectors. The weighted
solution is needed because the uncertainties have different
dimensions and vary between data points.

Thus uncertainties in the estimated Euler vectors are given by
the model variance-covariance matrix

$$\sigma_m^2 = (G^T W_d G)^{-1}. \qquad (8)$$
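
The weighted least squares solution and the model variance-covariance matrix can be sketched for a problem small enough to invert by hand. The example below is generic rather than a plate motion inversion: it fits a two-parameter straight line, assumes uncorrelated data so that the weight matrix is diagonal with elements $1/\sigma_i^2$, and solves a linear problem directly rather than for a perturbation to a starting model. The function name and the synthetic data are ours.

```python
def weighted_least_squares(G, d, sigma):
    """Solve m = (G^T W G)^-1 G^T W d for a two-parameter model,
    with W diagonal and W_ii = 1 / sigma_i**2.
    Returns the model and its variance-covariance matrix."""
    w = [1.0 / s**2 for s in sigma]
    # Normal matrix N = G^T W G and right-hand side b = G^T W d.
    N = [[sum(wi * g[j] * g[k] for wi, g in zip(w, G)) for k in range(2)]
         for j in range(2)]
    b = [sum(wi * g[j] * di for wi, g, di in zip(w, G, d)) for j in range(2)]
    # Invert the 2 x 2 normal matrix analytically.
    det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
    cov = [[N[1][1] / det, -N[0][1] / det],
           [-N[1][0] / det, N[0][0] / det]]
    m = [cov[0][0] * b[0] + cov[0][1] * b[1],
         cov[1][0] * b[0] + cov[1][1] * b[1]]
    return m, cov


# Synthetic data d = 2 + 3x with unequal (hypothetical) uncertainties.
x = [0.0, 1.0, 2.0, 3.0]
G = [[1.0, xi] for xi in x]          # partial derivatives of d w.r.t. m
d = [2.0 + 3.0 * xi for xi in x]
sigma = [0.1, 0.2, 0.1, 0.2]

m, cov = weighted_least_squares(G, d, sigma)
print("model:", m)
print("standard deviations:", [cov[0][0] ** 0.5, cov[1][1] ** 0.5])
```

The diagonal elements of `cov` give the variances of the model parameters, and the off-diagonal elements describe how their uncertainties trade off. Note that `cov` depends only on `G` and `sigma`, not on the data values themselves.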


Uncertainties associated with the Euler poles are often shown 
by error ellipses analogous to those for earthquake locations, 
whereas those for the rates are quoted separately. Alternat¬ 
ively, we can view the pole and rate uncertainties as forming 
a three-dimensional ellipsoid. Hence two Euler vectors are 
distinct if their error ellipsoids do not overlap. As we have seen, 
conventional global plate motion studies using magnetic 
anomalies, transforms, and earthquake slip vectors yield solu¬ 
tions similar to those obtained by using the same formulation 
to invert the rates and azimuths of plate motions determined by 
space-based geodesy (Section 5.2.3). This agreement is gratify¬ 
ing, given that the conventional solutions combine data from 
magnetic anomalies averaged over millions of years, the azi¬ 
muths of transform faults that formed over long times, and the 
slip vectors of earthquakes, whereas the space-geodetic solutions,
based on data spanning only a few years, have different
uncertainties.


7.5.2 Testing the results with $\chi^2$ and F-ratio tests


Given a model derived by inversion, the natural question is, 
how good is it? This issue is a specific case of the general one 
of testing how well a model fits data, which is discussed in 
statistics texts. For our purposes we focus on two issues and 
note some results without proof. 

One common way to test how well a model fits data uses the
misfit function $\chi^2$ that we minimized to derive the least squares
solution (Eqn 7.2.11). We write it as

$$\chi^2 = \sum_i \frac{(d_i - d_i^m)^2}{\sigma_i^2}, \qquad (9)$$

where $d_i^m$ are the data predicted by the model, $d_i$ are the data
observed, and $\sigma_i$ are their uncertainties. Lower values of $\chi^2$
correspond to better fits. However, because a model derived from
these data is bound to fit better than one derived without them,
we examine the reduced chi square

$$\chi_\nu^2 = \chi^2 / \nu, \qquad (10)$$

where the parameter $\nu$, known as the number of degrees of
freedom, equals $n - p$, where $n$ is the number of data and $p$ is the
number of model parameters estimated in the inversion.

If the model is a good fit to the data and our estimates of the
uncertainties are reasonable, then we expect $\chi_\nu^2$ to be around 1.
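
As a minimal sketch, the misfit and its reduced form can be computed directly from the observed data, the predicted data, and the assigned uncertainties. The numbers below are invented for illustration, and the function name is ours.

```python
def reduced_chi_square(observed, predicted, sigma, n_params):
    """Return (chi-square, reduced chi-square) for a model with
    n_params parameters estimated from the data."""
    chi2 = sum(((o - p) / s) ** 2
               for o, p, s in zip(observed, predicted, sigma))
    dof = len(observed) - n_params  # degrees of freedom, n - p
    return chi2, chi2 / dof


# Hypothetical data fit by a two-parameter model.
observed = [1.1, 1.9, 3.2, 3.9, 5.1]
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma = [0.1] * 5

chi2, chi2_reduced = reduced_chi_square(observed, predicted, sigma, 2)
print(chi2, chi2_reduced)

# Doubling the assigned uncertainties divides chi-square by four.
chi2_wide, chi2_wide_reduced = reduced_chi_square(
    observed, predicted, [0.2] * 5, 2)
print(chi2_wide, chi2_wide_reduced)
```

The second call shows how strongly the reduced chi square depends on the assigned uncertainties: the same residuals give a value well above 1 with the smaller uncertainties and below 1 with the larger ones.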



Fig. 7.5-1 Cumulative probability distribution $P(\chi_\nu^2, \nu)$, giving the
probability of observing $\chi_\nu^2$ above a certain value, plotted for 10 and 100
degrees of freedom. The more the degrees of freedom, the more likely $\chi_\nu^2$ is
to be near 1, and the less likely much higher or lower values are.


Statistically, this means that there is a reasonable possibility
that the observed data are samples from a parent distribution
described by the model, given the random uncertainties of measurement.
However, if $\chi_\nu^2$ is much larger than 1, it is unlikely
that the data are samples from this distribution. This issue
is addressed using the cumulative probability distribution
$P(\chi_\nu^2, \nu)$, given by statistical tables or mathematical software,
which gives the probability of observing $\chi_\nu^2$ above a certain value
(Fig. 7.5-1). In other words, this test asks what the probability
is that such a high value would be observed purely by chance
due to the uncertainties of measurement. The more the degrees
of freedom, the less likely a high value is. For example, the
chance of observing $\chi_\nu^2$ greater than 1.5 is about 13% for $\nu = 10$,
but less than 1% for $\nu = 100$. Thus, the more data we have, the
more the degrees of freedom, and the closer to 1 we expect $\chi_\nu^2$ to be.
This test does not tell specifically whether the data observed
are samples from the distribution predicted by the model, but
gives instead some insight into the probability. If $\chi_\nu^2$ is too large,
there is likely to be something wrong.
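
The probabilities quoted above can be checked without tables. The sketch below relies on a standard closed form for the upper tail of the $\chi^2$ distribution that holds only when the number of degrees of freedom is even, which covers the $\nu = 10$ and $\nu = 100$ cases; for general $\nu$ one would use a statistics library instead. The function names are ours.

```python
import math


def chi2_tail_even(chi2, dof):
    """Probability of observing a chi-square value at least this large,
    for an even number of degrees of freedom:
    exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)**k / k!"""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form requires even, positive dof")
    half = chi2 / 2.0
    term = math.exp(-half)   # k = 0 term
    total = 0.0
    for k in range(dof // 2):
        total += term
        term *= half / (k + 1)   # next term of the series
    return total


def reduced_chi2_tail(reduced_chi2, dof):
    """Probability of observing a reduced chi-square above a value."""
    return chi2_tail_even(reduced_chi2 * dof, dof)


p10 = reduced_chi2_tail(1.5, 10)
p100 = reduced_chi2_tail(1.5, 100)
print(p10, p100)
```

The two values reproduce the behavior described in the text: a reduced chi square of 1.5 is unremarkable with 10 degrees of freedom but very unlikely with 100.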

One possibility is that the model does not include some 
crucial factors. For example, a plate motion model may not 
include an important plate boundary, and so does not describe 
the data well. In this case, the misfit is greater than expected 
from considering only random uncertainties of measurement, 
because systematic errors are also present. Similarly, the misfit 
to travel time in an earthquake location includes both errors of 
measurement and the effects of velocity structure like lateral 
heterogeneity. We sometimes rescale the uncertainties to make
$\chi_\nu^2 = 1$, which lets us assign confidence limits using $\chi_\nu^2$. This
rescaling does not address the causes of the misfit, but impli¬ 
citly lumps the systematic errors in with the errors of measure¬ 
ment. To do better requires improving the model. 

Conversely, if $\chi_\nu^2$ is too small, Fig. 7.5-1 indicates that something
is also likely to be wrong. For example, for $\nu = 10$, there is
only about a 2% chance of observing $\chi_\nu^2$ less than 0.3, and the
probability is less for more degrees of freedom. This is because
the data are unlikely to be fit that well, given errors of measurement.
About one-third (100 - 68%) of the data should be misfit
by at least $1\sigma$, and about 5% should be outside the $2\sigma$ range.
Hence a low $\chi_\nu^2$ value, which we might view as showing an
excellent fit, is more likely to imply that the uncertainties in the
data have been overestimated, and have thus made $\chi_\nu^2$ appear
too small. For example, $\chi_\nu^2$ for the NUVEL-1 model is 0.24,
whereas it is expected to lie with 95% probability between 0.93
and 1.07. This effect is also seen for other plate motion models,
suggesting that the assigned data uncertainties are more like
($2\sigma$) confidence limits than one standard deviation. If so,
the uncertainties in the model are correspondingly less than
implied by the model variance-covariance matrix. Thus the $\chi^2$
test formalizes the adage that if something seems too good to be
true, it probably is. 1

A second issue is whether the number of model parameters is 
appropriate. As discussed in Chapter 5, there are often several 
possible plate boundary geometries for an area. Naturally, 
more plates can describe plate motions in an area better 
because the model has more parameters. Thus we ask whether 
the improved fit shown by a lower value of $\chi_\nu^2$ is more than
expected purely by chance due to the additional parameters.
For example, a set of data in the x-y plane are always better
fit by a higher-order polynomial, such as a quadratic versus a
straight line.

This issue can be addressed using the F-ratio test, which gives 
insight into whether a set of data are significantly better fit by a 
model with more parameters. The idea is that if a set of n data 
are fit by two models, one with r parameters (n - r degrees 
of freedom) and a second with p parameters (n-p degrees of 
freedom) with p greater than r, the second model should fit the 
data better, and x 2 iP) should be less than £ 2 (r). To test if the re¬ 
duction in x 1 is greater than would be expected simply because 
additional model parameters are added, we form the statistic 

r iX 2 (r) - xHp)V(P ~ r) ' (11) 

X 2 (p)!(n - p) 


r _ [X 2 (P plates) - X 2 iP + 1 plates)]/3 
X 2 {p + 1 plates)/(« - 3 p) 

is tested using P f (F, v v v 2 ) with v 1 = 3 and v 2 = (n- 3 p). If the 
risk that the improved fit would occur by chance is small, per¬ 
haps less than 1 %, then we treat the additional plate as distinct. 
Conversely, if the improved fit is likely to result simply from 
the additional parameters, the data do not strongly indicate 
the presence of an additional plate. For example, such tests 
show that although the boundary between them is indistinct, 
North and South America should be treated as separate plates. 
This approach is used to investigate complicated regions where 
the plate geometry is unclear, such as near Japan and in the 
Indian Ocean. Similarly, we can investigate regions of intraplate 
deformation to see whether there is resolvable motion. 

In many applications these or other statistical tests can be 
used to examine how well a model fits the data and to gain in¬ 
sight into whether the model is too simple (underparametrized) 
to explain the data or more complicated (overparametrized) 
than is required by the data. For example, we can examine 
cases when adding more layers to a velocity model significantly 
improves the fit to travel time data, when a more complex 
earthquake source model fits seismograms significantly better, 
or when a more complex model of earthquake recurrence de¬ 
scribes an earthquake history better. In these applications the 
statistical tests address only the data used, so a more complex 
model may be justified based on other data, even if it is not 
required by the data tested. Moreover, we often suspect that 
the earth is more complicated than we would like when using 
simple statistical models. In particular, we often have little a 
priori knowledge of how to estimate the random and system¬ 
atic errors. Even so, it is worth subjecting models to tests and 
seeing how well the data support our beliefs. This testing is a 
key part of the cycle (Fig. 1.1-8) by which models are refined 
using new data and model parameterizations. 


Statistical tables or mathematical software give the probability 
$P_F(F, \nu_1, \nu_2)$ of observing an F value greater than that observed 
for a random sample with $\nu_1 = (p - r)$ and $\nu_2 = (n - p)$. Thus, 
for example, if $P_F$ is 0.01, there is only a 1% chance that 
the improved fit of the model with more parameters is due 
purely to chance. Because this test depends on the ratio of $\chi^2$ values, 
it is not affected if the uncertainties are consistently over- or 
under-estimated. 
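
As a minimal sketch of the bookkeeping in this test, the function below forms an F value and its degrees of freedom from two misfits. It assumes the usual F-ratio definition, $F = [(\chi^2(r) - \chi^2(p))/(p - r)] / [\chi^2(p)/(n - p)]$, which is not restated in this passage; the names `f_ratio` and `plate_f_ratio` are hypothetical. The probability $P_F$ itself would still come from tables or statistical software.

```python
def f_ratio(chi2_simple, chi2_complex, n, r, p):
    """Return (F, nu1, nu2) for an r-parameter fit versus a p-parameter fit.

    Assumes F = [(chi2_simple - chi2_complex) / (p - r)] / [chi2_complex / (n - p)],
    with n data and p > r parameters (an assumed, standard definition).
    """
    if not (n > p > r >= 0):
        raise ValueError("need n > p > r >= 0")
    nu1 = p - r          # extra parameters in the more complex model
    nu2 = n - p          # degrees of freedom of the more complex model
    f = ((chi2_simple - chi2_complex) / nu1) / (chi2_complex / nu2)
    return f, nu1, nu2


def plate_f_ratio(chi2_p, chi2_p_plus_1, n, p):
    """Plate-motion case: p plates have 3(p - 1) parameters, p + 1 plates have 3p."""
    return f_ratio(chi2_p, chi2_p_plus_1, n, 3 * (p - 1), 3 * p)


# Illustrative numbers only (not from the text).
F, nu1, nu2 = plate_f_ratio(20.0, 10.0, n=23, p=2)
print(F, nu1, nu2)
```

Because only the ratio of misfits enters, multiplying both $\chi^2$ values by the same constant leaves F unchanged, which is the insensitivity to mis-scaled uncertainties noted above.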

We can use F to test whether the fit to n relative motion data 
of a model with p + 1 plates is significantly better than that of 
one with p plates. The p plate model has 3(p - 1) parameters 
(n - 3p + 3 degrees of freedom), whereas the p + 1 plate model 
has 3p parameters (n - 3p degrees of freedom). Thus 


1 This approach has been used to argue that Mendel’s famous results in 1865 that 
established the science of genetics are so good (the probability of observing them is 
0.004%) that they are suspect. Similarly, instructors have used $\chi^2$ tests to show that 
students’ results reported in laboratory classes are so good that they are unlikely to 
have actually been obtained. 


Further reading 

Many discussions of inverse theory, including ours, are based on Lanczos 
(1961). Applications in the earth sciences, especially seismology, are dis¬ 
cussed in texts and reviews including Parker (1977), Aki and Richards 
(1980), and Menke (1984). Treatments of tomographic methods in seis¬ 
mology are given by Nolet (1987), Thurber and Aki (1987), Spakman and 
Nolet (1988), Humphreys and Clayton (1988), and Romanowicz (1991). 
Inversion for the properties of stratified media is reviewed by Wiggins 
(1972). 

Tests for goodness of fit are discussed in statistical texts such as 
Bevington and Robinson (1992) and Freedman et al. (1991); the latter 
treats the issue of Mendel’s results. Chase (1972) and Minster et al. (1974) 
present the inverse problem for plate motions; the latter gives the partial 
derivatives. Stein and Gordon (1984) and DeMets et al. (1990) discuss 
applications of the F-ratio test to plate motions and intraplate deformation. 





Problems 


1. Show the following matrix identities: 

(a) For an arbitrary (not square) matrix A, the matrices $A^TA$ and 
$AA^T$ are symmetric. 

(b) For an arbitrary (not square) matrix B and a symmetric 
matrix A, $(B^TAB)^T = B^TAB$. 

(c) For square matrices A and B such that $(AB)^{-1}$ exists, 
$(AB)^{-1} = B^{-1}A^{-1}$. 

2. Show that if a square matrix G has an inverse, the inverse and gener¬ 
alized inverse are identical. 

3. Show that if the variance-covariance matrix of the data is diagonal, 
$\sigma^2_{d_{ij}} = \sigma_i^2 \delta_{ij}$ (with no summation implied), its inverse is another 
diagonal matrix $W_{d_{ij}} = \delta_{ij}/\sigma_i^2$ (also with no summation implied). 

4. Show that the model variance-covariance matrix (Eqn 7.2.32) 
$\sigma_m^2 = G^{-g} \sigma_d^2 (G^{-g})^T$ reduces to $\sigma_m^2 = \sigma^2 (G^TG)^{-1}$ when the data errors are 
uncorrelated and equal, so the data variance-covariance matrix is a 
constant times the identity matrix, $\sigma^2_{d_{ij}} = \sigma^2 \delta_{ij}$. 

5. Show that if the data errors are uncorrelated but not equal, such that 
the variance-covariance matrix of the data is the diagonal 
matrix $\sigma^2_{d_{ij}} = \sigma_i^2 \delta_{ij}$ with inverse $W_d$ (problem 3): 

(a) The least squares criterion (Eqn 7.2.14) for the inverse problem 
gives rise to the weighted least squares solution 
$\Delta m = (G^T W_d G)^{-1} G^T W_d \Delta d$. 

(b) The model variance-covariance matrix is $\sigma_m^2 = (G^T W_d G)^{-1}$. 

6. For a halfspace with uniform (and known) velocities $\alpha$ and $\beta$: 

(a) Show how the location problem can be formulated to use 
both P-wave and S-wave first arrival times as data. Write the 
data vector, model vector, and partial derivatives. How do 
these differ from the case for P waves alone? 

(b) Show how the location problem can be formulated to use only 
the difference between P- wave and S-wave first arrival times 
as data. Write the data vector, model vector, and partial 
derivatives. How do these differ from the case for P waves 
alone? How might you apply this method if only the P velocity 
were known? Under what conditions might this method be 
useful? 

7. For the idealized tomographic experiment in Figure 7.3-2: 

(a) Show how one row of the G matrix in Eqn 7.3.10 can be 
derived from the others, such that the four teleseismic ray 
paths are not linearly independent. Give a physical inter¬ 
pretation of this result. 

(b) Find four rows of the G matrix in Eqn 7.3.6 that are linearly 
independent, and give a physical interpretation of this result. 


Computer problems 


C-1. Write a subroutine to find the generalized inverse 
$G^{-g} = (G^TG)^{-1}G^T$ of an (n x r) matrix G, using a matrix inversion 
subroutine. As a test, check that the solution satisfies the criterion 
that for a square matrix G that has an inverse, the inverse and 
generalized inverse are identical. 

C-2. For a homogeneous halfspace with P-wave velocity $\alpha$: 

(a) Write a subroutine to compute the distance and travel time 
between two points $(x, y, z)$ and $(x_i, y_i, z_i)$. Test this for 
some simple cases. 

(b) Use the result of (a) to write a program that reads an earth¬ 
quake location, origin time, and medium velocity and the 
locations of n seismic stations, and finds the first arrival 
time at each station. 

(c) Write a subroutine using the result of (a) to compute the 
partial derivatives of the first arrival time at a station with 
respect to changes in the model parameters (location, 
origin time, and medium velocity). 

(d) Modify the result of (b) to compute arrival times for a start¬ 
ing model (assumed location, origin time, and medium 
velocity), and then locate the earthquake by inverting these 
synthetic data to find the best-fitting model. The result of 
C-l should be useful. Have the program iterate until the 
model change between iterations is less than a parameter 
you set. The program should have the option to invert for 
velocity or hold velocity fixed at an assumed starting value. 

C-3. Test the location program with a set of station locations, a “real” 
origin time and location, and an incorrect starting model. The 
program should retrieve the “real” model. Once this works for 
error-free data, add some errors to the travel times, either by using 
your computer’s random number function or by simply choosing 
some numbers. Invert for the best-fitting model, and see how 
the result of the inversion changes as the errors become a larger 
fraction of the travel times. How do the results depend on 
whether the velocity is held fixed or inverted for? 

C-4. Compute and compare $\chi^2$ and $\chi^2_\nu$ for C-3 for cases in which 
you inverted for velocity and in which the velocity is fixed at an 
incorrect value. Using the F-ratio test, does the improved fit due to 
inverting for velocity seem significant? 



Appendix: Mathematical and 
Computational Background 


If you wish to learn about nature, to appreciate nature, it is necessary to understand the language she speaks in. She offers her informa¬ 
tion only in one form; we are not so unhumble as to demand that she change before we pay attention. 

Richard Feynman, The Character of Physical Law (1982) 


A.1 Introduction 

The study of seismology follows a pattern characteristic of 
many scientific disciplines. We first identify phenomena that 
we seek to understand, such as the propagation of seismic 
waves through the solid earth. We then consider the physics of 
the simplest relevant case, such as the propagation of a wave 
of a single frequency through a uniform material, formulate 
the problem mathematically, and derive a solution. From this 
solution, we build up mathematical solutions to more complex 
problems, each of which is ideally a better approximation to 
the complexities of the real earth. Although the simpler pro¬ 
blems can be solved analytically, eventually the complexities 
require numerical techniques. 

We thus rely on a set of mathematical techniques often used 
in physical problems. Experience suggests that although many 
readers are familiar with most of the mathematics required 
in this book, a review is often helpful. This appendix briefly 
summarizes a broad range of material. The first sections treat 
a variety of mathematical topics. The final section reviews 
some concepts relevant to the use of computers for scientific 
calculations. 

In using these mathematical techniques, it is worth bear¬ 
ing in mind that we are invoking the special power of math¬ 
ematics to deal with physical problems. This power is that if a 
physical problem is posed correctly in mathematical terms, then 
applying mathematical techniques to this formulation yields 
quite different, and often apparently unrelated, statements 
that also correctly describe the physical world. For example, 
in Section 2.4 we used the equations of elasticity and applied 
vector calculus to derive the properties of seismic waves that 
we observe. Similarly, in Section 2.5 we derived an observed 
physical relation, Snell’s law, starting from three different phys¬ 
ical formulations. Conversely, we have seen that different phys¬ 
ical phenomena can be described using similar mathematical 
approaches and so have some deep similarities. Although in 
hindsight such successes may not seem surprising, because 
many of the mathematical methods we use were developed to 
solve such physical problems, they illustrate the intimate con¬ 
nection between sciences like seismology and mathematics. 1 

A.2 Complex numbers 

In several of our applications, notably in describing propagat¬ 
ing waves and their frequency content, complex numbers are 
helpful. We thus briefly review some of their properties. 

The complex number $z = a + ib$, where $i = \sqrt{-1}$, has a real 
part, a, and an imaginary part, b. These relations are sometimes 
written $a = \mathrm{Re}(z)$ and $b = \mathrm{Im}(z)$. Complex numbers are typically 
plotted in the complex plane with their real parts on the 
$x_1$ axis and their imaginary parts on the $x_2$ axis (Fig. A.2-1). 
Alternatively, a complex number can be written in polar 
coordinate form as 

$z = a + ib = re^{i\theta} = r(\cos\theta + i \sin\theta)$. (1) 


1 Most seismologists are more conservative than Paul Dirac, a leader in the 
development of quantum physics, who invented the delta function. Dirac regarded 
mathematical beauty as a guiding principle, stating that “it is more important to have 
beauty in one’s equations than to have them fit experiment.” 
Fig. A.2-1 A number in the complex plane can be represented in terms of 
its real and imaginary parts, $z = a + ib$, or in polar form $z = re^{i\theta}$. 


The polar coordinates, the magnitude r and the phase angle $\theta$, 
can be expressed in terms of the real and imaginary parts as 

$r = \sqrt{a^2 + b^2}, \quad \theta = \tan^{-1}(b/a)$, (2) 

and, conversely, 

$a = r \cos\theta, \quad b = r \sin\theta$. (3) 

To describe complex numbers in all four quadrants of the complex 
plane, $\theta$ ranges from 0 to $2\pi$. Because the tangent is 
periodic with period $\pi$, the inverse tangent alone is ambiguous, so 
the signs of the real and imaginary parts are used to obtain the 
correct phase. 

Complex numbers are equal when they have the same real 
and imaginary parts. Two complex numbers in (a + ib) form are 
added by adding the real parts and the imaginary parts: 

$(a_1 + ib_1) + (a_2 + ib_2) = (a_1 + a_2) + i(b_1 + b_2)$. (4) 

Complex numbers can be multiplied either in the (a + ib) form: 

$(a_1 + ib_1)(a_2 + ib_2) = (a_1a_2 - b_1b_2) + i(a_1b_2 + b_1a_2)$, (5) 

or in the magnitude and phase form: 

$r_1e^{i\theta_1} r_2e^{i\theta_2} = r_1r_2e^{i(\theta_1+\theta_2)}$. (6) 

The conjugate of a complex number z, $z^*$, has the same real 
part and an imaginary part of opposite sign. Because 

$z^* = a - ib = r \cos\theta - ir \sin\theta 
= r \cos(-\theta) + ir \sin(-\theta) = re^{-i\theta}$, (7) 

the conjugate has the same magnitude but the opposite phase. 
Hence the square of the magnitude of a complex number can be 
found by multiplication by the complex conjugate, 

$|z|^2 = zz^* = (a + ib)(a - ib) = (a^2 + b^2) = re^{i\theta} re^{-i\theta} = r^2$. (8) 

By combining 

$e^{i\theta} = \cos\theta + i \sin\theta$ and $e^{-i\theta} = \cos\theta - i \sin\theta$ (9) 

we obtain the definitions of the sine and cosine functions in 
terms of complex exponentials 

$\cos\theta = (e^{i\theta} + e^{-i\theta})/2$ and $\sin\theta = (e^{i\theta} - e^{-i\theta})/2i$. (10) 

These relations yield formulae for the trigonometric functions 
of the sum of the angles because 

$e^{i(\theta_1+\theta_2)} = \cos(\theta_1 + \theta_2) + i \sin(\theta_1 + \theta_2)$ (11) 

and, by Eqn 6, 

$e^{i(\theta_1+\theta_2)} = e^{i\theta_1}e^{i\theta_2} = (\cos\theta_1 + i \sin\theta_1)(\cos\theta_2 + i \sin\theta_2)$ 
$= (\cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2) + i(\sin\theta_1 \cos\theta_2 + \cos\theta_1 \sin\theta_2)$, (12) 

so we can equate the real and imaginary parts and find 

$\cos(\theta_1 + \theta_2) = \cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2$ 
$\sin(\theta_1 + \theta_2) = \sin\theta_1 \cos\theta_2 + \cos\theta_1 \sin\theta_2$. (13) 

These expressions are symmetric in $\theta_1$ and $\theta_2$, as expected. The 
corresponding relations for the trigonometric functions of the 
difference of two angles are found by making $\theta_2$ negative. 
Setting $\theta_1 = \theta_2$ gives expressions for $\cos(2\theta)$ and $\sin(2\theta)$. 

The relations for the product of trigonometric functions of 
two angles can also be found using complex exponentials 

$\cos\theta_1 \cos\theta_2 = \frac{(e^{i\theta_1} + e^{-i\theta_1})}{2} \frac{(e^{i\theta_2} + e^{-i\theta_2})}{2}$ 
$= \frac{1}{4}[(e^{i(\theta_1+\theta_2)} + e^{-i(\theta_1+\theta_2)}) + (e^{i(\theta_1-\theta_2)} + e^{-i(\theta_1-\theta_2)})]$ 
$= \frac{1}{2}[\cos(\theta_1 + \theta_2) + \cos(\theta_1 - \theta_2)]$ (14) 

and, similarly, 

$\sin\theta_1 \sin\theta_2 = \frac{(e^{i\theta_1} - e^{-i\theta_1})}{2i} \frac{(e^{i\theta_2} - e^{-i\theta_2})}{2i}$ 
$= \frac{1}{4}[(e^{i(\theta_1-\theta_2)} + e^{-i(\theta_1-\theta_2)}) - (e^{i(\theta_1+\theta_2)} + e^{-i(\theta_1+\theta_2)})]$ 
$= \frac{1}{2}[\cos(\theta_1 - \theta_2) - \cos(\theta_1 + \theta_2)]$. (15) 
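
The polar-form relations of this section can be checked numerically. The sketch below is illustrative only: it uses Python's standard `cmath` and `math` modules, and the helper names `to_polar` and `from_polar` are hypothetical. It shows why the signs of the real and imaginary parts are needed to obtain the phase (here via the two-argument arctangent), that multiplication multiplies magnitudes and adds phases, and that $zz^*$ gives the squared magnitude.

```python
import cmath
import math


def to_polar(z):
    """Return (r, theta) with theta in [0, 2*pi), using the signs of both parts."""
    r = abs(z)
    theta = math.atan2(z.imag, z.real) % (2.0 * math.pi)
    return r, theta


def from_polar(r, theta):
    """Return r * exp(i * theta) as a complex number."""
    return cmath.rect(r, theta)


# A number in the third quadrant: tan^-1(b/a) alone gives the wrong phase.
z = complex(-1.0, -1.0)
r, theta = to_polar(z)
naive_theta = math.atan(z.imag / z.real)   # pi/4, off by pi
print(r, theta, naive_theta)

# Multiplication in polar form: magnitudes multiply, phases add.
z1, z2 = complex(1.0, 2.0), complex(3.0, -1.0)
r1, t1 = to_polar(z1)
r2, t2 = to_polar(z2)
product_polar = from_polar(r1 * r2, t1 + t2)
print(product_polar, z1 * z2)

# Squared magnitude from the complex conjugate.
mag_sq = (z * z.conjugate()).real
print(mag_sq, r ** 2)
```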
Fig. A.3-1 A vector u is expressed by the Cartesian unit basis vectors and 
its components: $\mathbf{u} = u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3$. 

A.3 Scalars and vectors 

A.3.1 Definitions 

In seismology, we deal with several types of physical quantities. 
The simplest, scalars, are numbers describing a physical 
property at a given point that is independent of the coordinate 
system used to identify the point. Temperature, pressure, mass, 
and density are familiar examples. Mathematically, if a point 
is described in one coordinate system by $(x_1, x_2, x_3)$ and in a 
second by $(x'_1, x'_2, x'_3)$, the value of a scalar function $\phi$ in the 
first coordinate system equals that of the corresponding scalar 
function in the second 

$\phi(x_1, x_2, x_3) = \phi'(x'_1, x'_2, x'_3)$. (1) 

The distance between two points is a scalar because although 
the coordinates of the points depend on the coordinate system, 
the distance does not. 

Vectors are more complicated entities that have magnitude 
and direction. In seismology, the most common vector is 
the motion, or displacement , of a piece of material within the 
earth due to the passage of a seismic wave. Vectors transform 
between different coordinate systems in a specific way. Thus, if 
the horizontal ground motion is recorded with seismometers 
oriented northeast-southwest and northwest-southeast, the 
north-south and east-west components of the displacement 
can be found using the properties of vectors. We will see that 
although the components depend on the coordinate system, the 
magnitude and direction of the vector remain the same. 
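
The seismometer example above can be sketched numerically. The function below assumes (these conventions are not stated in the text) that the two instruments record positive motion toward the northeast and toward the northwest, respectively; the name `rotate_to_north_east` is hypothetical. The components change with the choice of axes, while the magnitude of the horizontal displacement does not.

```python
import math


def rotate_to_north_east(u_ne, u_nw):
    """Convert components along NE and NW axes to (north, east) components.

    Assumes unit vectors ne = (east, north) = (1, 1)/sqrt(2) and
    nw = (-1, 1)/sqrt(2), so that u = u_ne * ne + u_nw * nw.
    """
    s = math.sqrt(0.5)
    east = s * (u_ne - u_nw)
    north = s * (u_ne + u_nw)
    return north, east


# Components differ between the two coordinate systems...
north, east = rotate_to_north_east(3.0, 4.0)
print(north, east)

# ...but the magnitude of the vector is the same in both.
print(math.hypot(3.0, 4.0), math.hypot(north, east))
```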

Consider the familiar Cartesian coordinate system (Fig. A.3-1) 
with three mutually perpendicular (orthogonal) coordinate 
axes. There are two standard notations for these coordinates 
and axes: either the $x_1$, $x_2$, and $x_3$, or the x, y, and z axes. Each 
notation has advantages. The $x_1$, $x_2$, $x_3$ notation is more 
convenient for some derivations, and the x, y, z notation is 
sometimes clearer in physical problems. We use the $x_1$, $x_2$, and $x_3$ 
notation in this appendix, and use whichever notation seems 
more convenient in other discussions. 

Fig. A.3-2 A vector u is described in each of two orthogonal coordinate 
systems by the Cartesian unit basis vectors of the coordinate system and 
the components of the vector in the coordinate system: 
$\mathbf{u} = u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3 = u'_1\hat{\mathbf{e}}'_1 + u'_2\hat{\mathbf{e}}'_2 + u'_3\hat{\mathbf{e}}'_3$. 
Although the components differ between 
coordinate systems, the vector remains the same. 

A point in this coordinate system is described by its $x_1$, $x_2$, 
and $x_3$ coordinates. Because a vector can be defined by a line 
from the origin (0, 0, 0) to the point $(u_1, u_2, u_3)$, the three 
numbers $u_1$, $u_2$, and $u_3$ are the components of the vector u. A vector 
is denoted either by boldface type or by a set of its components 

$\mathbf{u} = (u_1, u_2, u_3) = (u_x, u_y, u_z)$. (2) 

A Cartesian coordinate system is described by three orthogonal 
unit basis vectors, $\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_2$, and $\hat{\mathbf{e}}_3$, along the $x_1$, $x_2$, and $x_3$ 
coordinate axes: 

$\hat{\mathbf{e}}_1 = (1, 0, 0) \quad \hat{\mathbf{e}}_2 = (0, 1, 0) \quad \hat{\mathbf{e}}_3 = (0, 0, 1)$. (3) 

The caret, or “hat” superscript, indicates a unit vector, whose 
length is 1. The vector u is formed from its components and the 
basis vectors 

$\mathbf{u} = u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3 = (u_1, u_2, u_3)$. (4) 

Now, consider a second Cartesian coordinate system with 
the same origin and different axes $x'_1$, $x'_2$, and $x'_3$, along which 
unit basis vectors $\hat{\mathbf{e}}'_1$, $\hat{\mathbf{e}}'_2$, and $\hat{\mathbf{e}}'_3$ are defined (Fig. A.3-2). In this 
coordinate system the components of u are different, 

$\mathbf{u} = u'_1\hat{\mathbf{e}}'_1 + u'_2\hat{\mathbf{e}}'_2 + u'_3\hat{\mathbf{e}}'_3 = (u'_1, u'_2, u'_3)$. (5) 
Fig. A.3-3 A vector in two dimensions making an angle $\theta$ with the $x_1$ axis. 


Thus the same physical vector is represented in a different 
coordinate system, described by a different set of basis vectors, 
using different components. The essential idea is that the 
vector remains the same, or invariant, regardless of the coordin¬ 
ate system, although the numerical values of its components 
change. Physical laws, like Newton’s law stating that the force 
vector equals the product of the mass and the acceleration 
vector (the second derivative with respect to time of the dis¬ 
placement vector), are written in vector form because the phys¬ 
ical phenomenon does not depend on the coordinate system 
used to describe it. 

The length or magnitude of a vector, $|\mathbf{u}|$, is a scalar, and thus 
the same in different coordinate systems. By the Pythagorean 
theorem, the length is 

$|\mathbf{u}| = (u_1^2 + u_2^2 + u_3^2)^{1/2} = (u_1'^2 + u_2'^2 + u_3'^2)^{1/2}$. (6) 

The zero vector, 0, all of whose components are zero in any 
coordinate system, has zero magnitude. 

A vector is specified either in Cartesian coordinates by its 
components or in polar coordinates by its magnitude and 
direction. For example, in a two-dimensional $(x_1, x_2)$ coordinate 
system (Fig. A.3-3), the vector v can be written in terms of its 
components 

$\mathbf{v} = (v_1, v_2)$ (7) 

or its magnitude 

$|\mathbf{v}| = (v_1^2 + v_2^2)^{1/2}$ (8) 

and direction, given by the angle $\theta$ that v makes with the $x_1$ 
direction 

$\theta = \tan^{-1}(v_2/v_1)$. (9) 

Just as $|\mathbf{v}|$ and $\theta$ are given by the components, so the 
components are given by $|\mathbf{v}|$ and $\theta$ 

$v_1 = |\mathbf{v}| \cos\theta$ and $v_2 = |\mathbf{v}| \sin\theta$. (10) 

By analogy, a vector in three dimensions is specified by either 
its three components or its magnitude and the angles it forms 
with two of the coordinate axes. It is worth noting that the 
mathematical convention of defining angles counterclockwise 
from $x_1$ differs from the geographical convention of defining 
angles clockwise from North ($x_2$), so conversions are often 
needed. 
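
A minimal sketch of the conversion between the two angle conventions follows, assuming $x_1$ points east and $x_2$ points north as in the text; the function names are hypothetical and angles are in degrees. The same expression, 90° minus the angle, converts in either direction.

```python
import math


def math_angle_to_azimuth(theta_deg):
    """Counterclockwise-from-x1 (east) angle -> clockwise-from-north azimuth."""
    return (90.0 - theta_deg) % 360.0


def azimuth_to_math_angle(azimuth_deg):
    """Clockwise-from-north azimuth -> counterclockwise-from-x1 (east) angle."""
    return (90.0 - azimuth_deg) % 360.0


def azimuth_of(v1, v2):
    """Azimuth of a horizontal vector with east component v1 and north component v2."""
    return math.degrees(math.atan2(v1, v2)) % 360.0


print(math_angle_to_azimuth(0.0))    # vector along x1 points east
print(math_angle_to_azimuth(180.0))  # vector along -x1 points west
print(azimuth_of(1.0, 1.0))          # northeast
```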

A.3.2 Elementary vector operations 

The simplest vector operation is multiplication of a vector by a 
scalar 

$\alpha\mathbf{u} = (\alpha u_1, \alpha u_2, \alpha u_3)$. (11) 

For example, in two dimensions, 

$\alpha\mathbf{v} = (\alpha v_1, \alpha v_2)$ (12) 

yields a vector with magnitude 

$((\alpha v_1)^2 + (\alpha v_2)^2)^{1/2} = |\alpha| (v_1^2 + v_2^2)^{1/2} = |\alpha| |\mathbf{v}|$ (13) 

whose direction is given by 

$\tan\theta = \alpha v_2/\alpha v_1 = v_2/v_1$. (14) 


Multiplication by a positive scalar thus changes the magnitude 
of a vector but preserves its direction. Similarly, multiplication 
by a negative scalar changes the magnitude of a vector and 
reverses its direction. A unit vector in the direction of u, $\hat{\mathbf{u}}$, is 
formed by dividing u by its magnitude 

$\hat{\mathbf{u}} = \mathbf{u}/|\mathbf{u}|$. (15) 

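The sum of two vectors is another vector whose components 
are the sums of the corresponding components, so if 

$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3$ and $\mathbf{b} = b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3$, 
$\mathbf{a} + \mathbf{b} = (a_1 + b_1)\hat{\mathbf{e}}_1 + (a_2 + b_2)\hat{\mathbf{e}}_2 + (a_3 + b_3)\hat{\mathbf{e}}_3 = \mathbf{b} + \mathbf{a}$. (16) 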

Addition can be done graphically (Fig. A.3-4) by shifting one 
vector, while preserving its orientation, so that its “tail” is at 
the “head” of the other, and forming the vector sum. For ex¬ 
ample, the total force vector acting on an object is the vector 
sum of the individual force vectors. Equation 16 and Fig. A.3-4 
show that vector addition is commutative; it does not matter in 
which order the vectors are added. 

A.3.3 Scalar products 

There are two methods of multiplying vectors. The first, the 
scalar product (also called the dot product or inner product), 
yields a scalar: 

$\mathbf{a} \cdot \mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = |\mathbf{a}| |\mathbf{b}| \cos\theta$, (17) 

where $\theta$ is the angle between the two vectors. To see that the two 
definitions of the scalar product are equivalent, consider a 
two-dimensional case (Fig. A.3-5) with $\mathbf{a} = (a_1, a_2)$ and $\mathbf{b} = (b_1, b_2)$. If 
a and b make angles $\theta_1$ and $\theta_2$ with the $x_1$ axis, then 

$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| |\mathbf{b}| \cos\theta = |\mathbf{a}| |\mathbf{b}| \cos(\theta_2 - \theta_1)$. (18) 

Using a trigonometric identity (Eqn A.2.13) we expand 

$\cos\theta = \cos(\theta_2 - \theta_1) = \cos\theta_2 \cos\theta_1 + \sin\theta_2 \sin\theta_1$. (19) 

Because 

$\cos\theta_1 = a_1/(a_1^2 + a_2^2)^{1/2}$ and $\sin\theta_1 = a_2/(a_1^2 + a_2^2)^{1/2}$, (20) 

and similar definitions hold for $\theta_2$ and b, substitutions for the 
angles in Eqn 18 show that 

$|\mathbf{a}| |\mathbf{b}| \cos\theta = a_1b_1 + a_2b_2$. (21) 

Fig. A.3-4 Addition of vectors a and b. The addition can be done 
analytically, by adding components, or graphically. Vector addition is 
commutative, as the order of addition is irrelevant. 

Fig. A.3-5 Derivation of alternative definitions of the scalar product $\mathbf{a} \cdot \mathbf{b}$ 
in two dimensions. 

Equation 17 shows several features of the scalar product: 

• The scalar product commutes: $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$. 

• The scalar product of two perpendicular vectors is zero, 
because cos 90° = 0. 

• The scalar product of a vector with itself is its magnitude 
squared: 

$\mathbf{a} \cdot \mathbf{a} = a_1a_1 + a_2a_2 + a_3a_3 = |\mathbf{a}|^2$. (22) 

The definition of the scalar product is generalized for vectors 
with complex components. To see why, note that for a vector 
$\mathbf{a} = (i, 1, 0)$, where $i = \sqrt{-1}$, Eqn 22 would give a squared 
magnitude of zero. Because we would like only the zero vector, all 
of whose elements are zero, to have zero magnitude, Eqn 17 is 
generalized to 

$\mathbf{a} \cdot \mathbf{b} = a_1^*b_1 + a_2^*b_2 + a_3^*b_3$ (23) 

where * indicates the complex conjugate. Thus the definition of 
the squared magnitude (Eqn 22) becomes 

$\mathbf{a} \cdot \mathbf{a} = a_1^*a_1 + a_2^*a_2 + a_3^*a_3 = |\mathbf{a}|^2$. (24) 

For example, the squared magnitude $|(i, 1, 0)|^2 = (i)(-i) + 
(1)(1) = 2$. These complex definitions reduce to the familiar 
cases (Eqns 17 and 22) for vectors with real components. 

The relations between the unit basis vectors for a Cartesian 
coordinate system, $\hat{\mathbf{e}}_1$, $\hat{\mathbf{e}}_2$, and $\hat{\mathbf{e}}_3$, are easily stated using their 
scalar products. Because each is perpendicular to the other two, 
the scalar product of any two different ones is zero, 

$\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}_3 = 0$, (25) 

and the scalar product of each with itself is its squared 
magnitude 

$\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}_3 = 1$. (26) 

The unit basis set of vectors is orthonormal; each is orthogonal 
(perpendicular) to the others and normalized to unit 
magnitude. 

The projection, or component of a vector in a direction given 
by a unit vector, is the scalar product of a vector with the unit 
vector. Using this idea, a component of a vector can be found 
from its projection on the unit basis vector along the 
corresponding axis. Thus the $x_1$ component of u is 

$\mathbf{u} \cdot \hat{\mathbf{e}}_1 = (u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3) \cdot \hat{\mathbf{e}}_1 = u_1$, (27) 

with the other components defined similarly. 
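
The scalar product relations of this section can be sketched in a few lines. The code below is an illustration with hypothetical helper names; it represents vectors as plain tuples and conjugates the first argument, following the generalized definition for complex components. It reproduces the (i, 1, 0) example, the angle between two real vectors, and the projection onto a basis vector.

```python
import math


def dot(a, b):
    """Scalar product with the first vector conjugated (reduces to the real case)."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))


def magnitude(a):
    """Length of a vector from the scalar product with itself."""
    return math.sqrt(abs(dot(a, a)))


def angle_between(a, b):
    """Angle (radians) between two real vectors, from a.b = |a||b| cos(theta)."""
    c = dot(a, b) / (magnitude(a) * magnitude(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error


def component_along(u, unit_vector):
    """Projection of a real vector u on a unit vector."""
    return dot(u, unit_vector)


a = (1j, 1, 0)
print(sum(x * x for x in a))  # unconjugated sum: zero for a nonzero vector
print(dot(a, a))              # conjugated definition: squared magnitude of 2

print(math.degrees(angle_between((1, 0, 0), (1, 1, 0))))
print(component_along((3, 4, 5), (1, 0, 0)))
```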

A.3.4 Vector products 

A second form of multiplication, the vector or cross product, 
forms a third vector from two vectors by 

$\mathbf{a} \times \mathbf{b} = (a_2b_3 - a_3b_2)\hat{\mathbf{e}}_1 + (a_3b_1 - a_1b_3)\hat{\mathbf{e}}_2 + (a_1b_2 - a_2b_1)\hat{\mathbf{e}}_3$, (28) 

which can be written as the determinant 

$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$. (29) 

Fig. A.3-6 Illustration of the right-hand rule giving the orientation of the 
vector product $\mathbf{a} \times \mathbf{b}$. 


The vector product of two vectors is perpendicular to both 
vectors. For example, if a and b are in the $x_1$-$x_2$ plane, $a_3 = b_3 = 0$, 
and by Eqn 28, the vector product has only an $\hat{\mathbf{e}}_3$ 
component. This can be shown in general by evaluating 
$\mathbf{a} \cdot (\mathbf{a} \times \mathbf{b}) = \mathbf{b} \cdot (\mathbf{a} \times \mathbf{b}) = 0$. Geometrically, the direction of the vector 
product is found by a “right-hand rule” (Fig. A.3-6): if the fingers 
of a right hand rotate from a to b, the thumb points in the 
direction $\mathbf{a} \times \mathbf{b}$. The magnitude of the cross product is 

$|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}| |\mathbf{b}| \sin\theta$, (30) 

where $\theta$ is the angle between the two vectors. The cross product 
is zero for parallel vectors because sin 0° = 0, so the cross 
product of a vector with itself is zero. 

The vector product often appears in connection with rotations, 
such as those used to describe the motion of lithospheric 
plates (Section 5.2). For example, if an object located at a 
position r undergoes a rotation, its linear velocity v is given by 

$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, (31) 

where $\boldsymbol{\omega}$ is the rotation vector, which is oriented along the axis 
of rotation, with a magnitude $|\boldsymbol{\omega}|$ that is the angular velocity 
(Fig. A.3-7). Similarly, the vector product is used to define the 
torque, which gives the rate of change of angular momentum. 
A force F, acting at a point r, gives a torque 

$\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F}$. (32) 

Fig. A.3-7 The vector product $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$ describes a rotation. 

Fig. A.3-8 The $x_3$ component of the vector product $\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F}$ gives the 
torque, $r_1F_2 - r_2F_1$, about the $x_3$ axis. In this case $r_1F_2$ is greater than $r_2F_1$, 
so counterclockwise rotation about the $x_3$ axis occurs. 

For example, the torque about the $x_3$ axis is $\tau_3 = (r_1F_2 - r_2F_1)$, 
so each component of the force contributes a counterclockwise 
torque equal to the component times its lever arm, the 
perpendicular distance of the point from that axis (Fig. A.3-8). 
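
As a short illustration of these relations, the sketch below evaluates the component definition of the vector product for tuples of three numbers; the helper names and the numerical values are hypothetical. It checks that the product is perpendicular to both vectors and applies it to a rotation about the $x_3$ axis and to a torque.

```python
def cross(a, b):
    """Vector product of two three-component vectors (component definition)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two real vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


a, b = (1, 2, 3), (4, 5, 6)
c = cross(a, b)
print(c, dot(a, c), dot(b, c))      # c is perpendicular to both a and b

# Rotation about the x3 axis: the linear velocity is omega x r.
omega = (0, 0, 2)                   # angular velocity 2 about x3 (illustrative)
r = (1, 0, 0)
print(cross(omega, r))              # velocity along +x2, i.e. counterclockwise

# Torque of a force along x2 acting at a point on the x1 axis.
print(cross((1, 0, 0), (0, 3, 0)))  # only an x3 component, r1*F2 - r2*F1
```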

Some useful identities, whose proofs are left as problems, are 

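$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$ 

$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$ 

$\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b})$ 

$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b})$. (33) 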

A.3.5 Index notation 

Vector equations, such as the definition of the cross product, 
can be cumbersome when written in terms of the components. 
Simplification can be obtained using index notation, whereby 
an index assuming all possible values replaces the subscripts 
indicating coordinate axes. For example, the vector $\mathbf{u} = (u_1, u_2, u_3)$ 
is written $u_i$, where i can be 1, 2, or 3. In this notation, the 
scalar product is 

$\mathbf{a} \cdot \mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = \sum_{i=1}^{3} a_ib_i$. (34) 

Because the sum over all coordinates appears frequently, the 
Einstein summation convention is often used, whereby an 
index repeated twice implies a summation over that index, and 
the summation sign is not explicitly written. Hence the scalar 
product of two real vectors is written 

$\mathbf{a} \cdot \mathbf{b} = a_ib_i$, (35) 

using implied summation over the repeated index i. Similarly, 
the square of the magnitude of a real vector is 

$|\mathbf{u}|^2 = \mathbf{u} \cdot \mathbf{u} = u_iu_i$. (36) 

A repeated index is called a “dummy” index, like a dummy 
variable of integration, because it is used only within the 
summation. The form of the expression indicates that $u_iu_i$ is a 
scalar; because the repeated index is summed, no index remains 
“free.” By contrast, $u_i$ is a vector, because there is a free index. 

Index notation is further simplified by introducing two symbols, 
$\delta_{ij}$ and $\epsilon_{ijk}$. The Kronecker delta, $\delta_{ij}$, is defined 

$\delta_{ij} = 0$ if $i \neq j$, 
$\delta_{ij} = 1$ if $i = j$. (37) 

So, for example, $\delta_{11} = 1$, but $\delta_{12} = 0$. Using the Kronecker 
delta symbol, the relations between the Cartesian basis vectors 
(Eqns 25, 26) can be written compactly as 

$\hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j = \delta_{ij}$. (38) 

The Kronecker delta, a function of two discrete variables i and 
j, is analogous to the Dirac delta function, which is a function of 
a continuous variable (Section 6.2.5). 

The permutation symbol, $\epsilon_{ijk}$, is defined as 

$\epsilon_{ijk} = 0$ if any of the indices are the same, 
$\epsilon_{ijk} = 1$ if i, j, k are in order, i.e., (1, 2, 3), (2, 3, 1), or (3, 1, 2), 
$\epsilon_{ijk} = -1$ if i, j, k are out of order, i.e., (2, 1, 3), (3, 2, 1), (1, 3, 2). (39) 

Cases where the indices are in order are known as even, or 
cyclic, permutations of the indices; those in which the indices 
are out of order are known as odd permutations. Because of the 
symmetries in the definition, $\epsilon_{ijk} = \epsilon_{jki} = \epsilon_{kij}$. A useful relation, 
whose proof is left for the problems, is 


Using index notation, the definition of the vector product 
(Eqn 28) becomes 

$(\mathbf{a} \times \mathbf{b})_i = \sum_{j=1}^{3} \sum_{k=1}^{3} \epsilon_{ijk}a_jb_k = \epsilon_{ijk}a_jb_k$, (41) 

where the last form uses the summation convention. The notation 
shows that the cross product yields a vector because only 
one index, i, remains free after the repeated indices j and k are 
summed. To see that the index notation gives the correct 
definition, we expand the i = 2 component as 

$(\mathbf{a} \times \mathbf{b})_2 = \epsilon_{211}a_1b_1 + \epsilon_{212}a_1b_2 + \epsilon_{213}a_1b_3 + \epsilon_{221}a_2b_1 + \epsilon_{222}a_2b_2$ 
$+ \epsilon_{223}a_2b_3 + \epsilon_{231}a_3b_1 + \epsilon_{232}a_3b_2 + \epsilon_{233}a_3b_3$ 
$= (a_3b_1 - a_1b_3)$, (42) 

because the only nonzero $\epsilon_{2jk}$ terms are $\epsilon_{213} = -1$ and $\epsilon_{231} = 1$. 
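
The index-notation form of the vector product translates almost directly into nested sums. The sketch below is illustrative, with hypothetical function names; indices run over 1, 2, 3 as in the text, and the closed-form expression used for the permutation symbol is one convenient choice that reproduces the definition, not a formula taken from the text.

```python
def kronecker(i, j):
    """Kronecker delta: 1 if the indices are equal, 0 otherwise."""
    return 1 if i == j else 0


def permutation(i, j, k):
    """Permutation symbol for indices in {1, 2, 3}: +1 even, -1 odd, 0 if repeated."""
    return (i - j) * (j - k) * (k - i) // 2


def cross_index(a, b):
    """(a x b)_i = eps_ijk a_j b_k, summing over the repeated indices j and k."""
    idx = (1, 2, 3)
    return tuple(
        sum(permutation(i, j, k) * a[j - 1] * b[k - 1] for j in idx for k in idx)
        for i in idx
    )


a, b = (1, 2, 3), (4, 5, 6)
print(cross_index(a, b))   # second component equals a3*b1 - a1*b3
print(cross_index(a, a))   # symmetric a_j a_k against antisymmetric eps_ijk
```

Only two of the nine terms in each component survive, which is the cancellation seen when the i = 2 component is written out in full.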

Index notation points out an interesting feature of the vector 
product. Because $a_ib_i = b_ia_i$, the scalar product commutes. 
By contrast, the properties of the permutation symbol show 
that 

$(\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk}a_jb_k = -\epsilon_{ijk}b_ja_k = -(\mathbf{b} \times \mathbf{a})_i$, (43) 

so the order matters for the vector product. 

Although index notation seems unnatural at first, it does 
more than simply shorten expressions. The notation explicitly 
indicates what operations must be performed, and thus makes 
them easier to evaluate. For example, suppose we seek to show 
that the cross product of a vector with itself is zero. In contrast 
to $(\mathbf{a} \times \mathbf{a})$, the notation $\epsilon_{ijk}a_ja_k$ shows how the cross product 
should be evaluated. Because $a_ja_k$ is symmetric in the indices 
j and k, the permutation symbol makes the terms involving 
any pair of j and k sum to zero. We will see that index notation 
makes the complicated expressions that we encounter in 
studying stress and strain easier to evaluate. 

A.3.6 Vector spaces 

These concepts for vectors can be generalized in several ways. 
In three dimensions any vector is a weighted combination of 
three basis vectors. The usual choice of basis vectors along 
coordinate axes is for simplicity. We could choose any three 
mutually orthogonal vectors, which need not be of unit length, 
to be the basis vectors. To see this, remember that a physical 
vector does not depend on the coordinate system. 

Moreover, the idea of vectors in two- or three-dimensional 
space can be generalized to spaces with a larger number of 
dimensions. For example, given unit vectors 

$\hat{\mathbf{e}}_1 = (1, 0, 0, 0, 0)$, $\hat{\mathbf{e}}_2 = (0, 1, 0, 0, 0)$, $\hat{\mathbf{e}}_3 = (0, 0, 1, 0, 0)$, 
$\hat{\mathbf{e}}_4 = (0, 0, 0, 1, 0)$, $\hat{\mathbf{e}}_5 = (0, 0, 0, 0, 1)$, (44) 

a vector u can be formed from the basis vectors and components 

$\mathbf{u} = u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3 + u_4\hat{\mathbf{e}}_4 + u_5\hat{\mathbf{e}}_5 = (u_1, u_2, u_3, u_4, u_5)$. (45) 

This vector is defined in a five-dimensional space, with five axes 
each orthogonal to the others, because their scalar products are 
zero. Although this is difficult to visualize (or draw), the math¬ 
ematics carries through directly from the three-dimensional 
case. N mutually orthogonal vectors thus provide a basis for an 
N-dimensional space. 

These ideas are formalized in terms of vectors in a general 
linear vector space . For our purposes, a vector space is a collec¬ 
tion of vectors x, y, z, satisfying several criteria: 

• The sum of any two vectors in the space is also in the 
space. 

• Vector addition commutes: x + y = y + x. 

• Vector addition is associative: (x + y) + z = x + (y + z). 

• There exists a unique vector 0 such that for all x, x = x + 0. 

• For each x there exists a unique vector -x such that 
x + (-x) = 0. 

• Scalar multiplication is associative: $\alpha(\beta\mathbf{x}) = (\alpha\beta)\mathbf{x}$. 

• Scalar multiplication is distributive: $\alpha(\mathbf{x} + \mathbf{y}) = \alpha\mathbf{x} + \alpha\mathbf{y}$ 
and $(\alpha + \beta)\mathbf{x} = \alpha\mathbf{x} + \beta\mathbf{x}$. 

A point worth considering is the number of independent 
vectors in a vector space. Given N vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$ in a 
linear vector space, a weighted sum $\sum \alpha_i\mathbf{x}_i$ is called a linear 
combination. The N vectors are linearly independent if 

$\sum_{i=1}^{N} \alpha_i\mathbf{x}_i = 0$ only when all $\alpha_i = 0$, (46) 

so that no vector can be expressed as a combination of the 
others. Otherwise, the vectors are linearly dependent, and one 
can be expressed as a linear combination of the others. 

This idea corresponds to that of basis vectors. If N basis 
vectors are mutually orthogonal, they are linearly independent. 
Because any vector in an N-dimensional space is a linear com¬ 
bination of N linearly independent basis vectors, the basis 
vectors span the space. Thus the dimension of a vector space
is the maximum number of linearly independent vectors within it. For
example, we cannot find four linearly independent vectors in
three dimensions.
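
As a rough illustration of Eqn 46, linear independence can be tested numerically by counting how many vectors survive row reduction. The sketch below is not from the text: the function names `rank` and `linearly_independent` are illustrative, and exact rational arithmetic is assumed so that no tolerance is needed.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, found by row reduction."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # look for a row at or below r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True if no vector is a linear combination of the others."""
    return rank(vectors) == len(vectors)

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
four_vectors = basis + [(1, 1, 1)]
```

Here `linearly_independent(basis)` holds, whereas `four_vectors` fails the test, consistent with the statement that four vectors in three dimensions cannot be independent.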

Though vector spaces sound abstract, they are useful in 
seismology. For example, in Chapter 2 we represent travelling 
waves by normal modes, which are orthogonal basis vectors in 
a vector space, so any wave is a weighted sum of them. The 
modes of a string (Section 2.2.5) form a Fourier series (Chap¬ 
ter 6), in which a function is expanded into sine and cosine 
functions that are the basis vectors of a vector space. A sim¬ 
ilar approach is also used for the modes of the spherical earth 
(Section 2.9). Vector space ideas are also used in inverting 
seismological observations to study earth structure (Chapter 
7). 


A.4 Matrix algebra 

A.4.1 Definitions


Matrix algebra is a powerful tool often used to study systems of 
equations. As a result, it appears in seismological applications 
including stresses and strains, locating earthquakes, and seismic 
tomography. We thus review some basic ideas, often stating 
results without proof and leaving proofs for the problems. Fur¬ 
ther discussion of these topics can be found in linear algebra texts. 

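Given a matrix A with m rows and n columns, called an
m × n matrix,

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \tag{1}$$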


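and a second matrix B, also with m rows and n columns,
matrix addition is defined by

$$A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{pmatrix}. \tag{2}$$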


The usual convention is to indicate matrices with capital letters 
and their elements with lower-case ones. 

Matrix multiplication is defined such that for a matrix A that
is m × n and a matrix B that is n × r, the ij-th element of the m × r
product matrix C = AB is defined by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = a_{ik} b_{kj}. \tag{3}$$


The ij-th element of C is the scalar product of the i-th row of A
and the j-th column of B. As a result, for matrix multiplication
the two matrices need not have the same number of rows
and columns, but must have the number of columns in the first
matrix equal to the number of rows in the second. Often the
numbers of rows and columns in the two matrices allow multiplication
in only one order. Thus, in the example above, A
“premultiplies” B, or B “postmultiplies” A. A convenient way
to remember this is that the number of columns in the first
matrix must equal the number of rows in the second, but this
dimension does not appear in the product. In the case of AB =
C, written schematically, we have [m × n][n × r] = [m × r].
Hence, in the final form in Eqn 3, the summation convention
shows that k is summed out, leaving i and j as free indices, so $c_{ij}$
is a matrix element. Furthermore, even if both AB and BA are
allowed, the two products are generally not equal, so matrix
multiplication is not commutative.
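
The definition in Eqn 3 translates almost directly into code. The following is a minimal sketch with matrices stored as lists of rows; the helper name `matmul` and the sample matrices are chosen here purely for illustration.

```python
def matmul(A, B):
    """Product C = AB with c_ij = sum over k of a_ik * b_kj (Eqn 3)."""
    n = len(A[0])  # number of columns of A
    if any(len(row) != n for row in A) or len(B) != n:
        raise ValueError("columns of A must equal rows of B")
    r = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)  # B postmultiplies A: the columns of A are swapped
BA = matmul(B, A)  # B premultiplies A: the rows of A are swapped

# A [2 x 3] matrix times a [3 x 1] matrix gives a [2 x 1] matrix.
column = matmul([[1, 0, 2], [0, 1, 3]], [[1], [1], [1]])
```

For these two matrices AB and BA differ, which illustrates that multiplication does not commute even when both products are allowed.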

The identity matrix, I, is a square matrix (one with the same
number of rows and columns) whose diagonal elements are
equal to 1 while all other elements are 0:

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}. \tag{4}$$

The identity matrix has the property that for any square matrix A,

$$AI = IA = A. \tag{5}$$

The transpose of a matrix A, $A^T$, is derived by placing the
rows of A into the columns of $A^T$, so for $C = A^T$,

$$c_{ij} = a_{ji}. \tag{6}$$

The transpose has the properties that for matrices A and B,

$$(A + B)^T = A^T + B^T \quad \text{and} \quad (AB)^T = B^T A^T. \tag{7}$$

With these definitions, vector operations can be expressed
using matrix algebra, by treating vectors as matrices with one
column. For example, premultiplication of a vector by a matrix
yields another vector, y = Ax, such that

$$y_i = \sum_j a_{ij} x_j \quad \text{or} \quad y_i = a_{ij} x_j, \tag{8}$$

where the second form uses the summation convention. Each
component $y_i$ is the scalar product of the i-th row of A with x.
Similarly, the scalar product of two vectors is given by the
matrix product

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b} = \sum_i a_i b_i = a_i b_i. \tag{9}$$

Thus the scalar product of two vectors yields a scalar, because a
1 × m matrix times an m × 1 matrix is a 1 × 1 matrix, or single
value. The squared magnitude of a real vector can be written as

$$|\mathbf{u}|^2 = \mathbf{u} \cdot \mathbf{u} = \mathbf{u}^T \mathbf{u} = \sum_i u_i u_i = u_i u_i. \tag{10}$$

For vectors with complex components, the scalar product
(Eqn A.3.23) is

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^{*T} \mathbf{b} = \sum_i a_i^* b_i = a_i^* b_i. \tag{11}$$

This brings us to a minor point of notation. In linear algebra,
as in the last few equations, it is common to treat vectors as
column vectors represented by n × 1 matrices with n rows and
one column

$$\mathbf{u} = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \tag{12}$$

whose transposes are row vectors (one row, n columns) like

$$\mathbf{u}^T = (u_1, u_2, \ldots, u_n). \tag{13}$$

Nonetheless, to save space, we sometimes write

$$\mathbf{u} = (u_1, u_2, \ldots, u_n) \tag{14}$$

while treating u as a column vector when required. Strictly
speaking, we should call the row vector $\mathbf{u}^T$.

We often encounter matrices that are symmetric , or equal 
their transposes, 

$$A = A^T, \quad a_{ij} = a_{ji}. \tag{15}$$

For a matrix A with complex elements, the conjugate matrix
$A^*$ is formed by taking the conjugate of each element, and the
transpose is generalized to the adjoint matrix $A^\dagger = A^{*T}$, which
is the complex conjugate of $A^T$. Note that if the elements of A
are real, $A^\dagger = A^T$. A matrix A is Hermitian if it equals its adjoint

$$A = A^\dagger, \quad a_{ij} = a_{ji}^*. \tag{16}$$

If A is real, “Hermitian” and “symmetric” are equivalent. 


A.4.2 Determinant 

A useful entity is the determinant of a matrix, written det A, or
|A|. For an n × n matrix,

$$\det A = \sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_n=1}^{n} s(j_1, j_2, \ldots, j_n)\, a_{1j_1} a_{2j_2} \cdots a_{nj_n}. \tag{17}$$

This complicated sum over n indices, $j_1, j_2, \ldots, j_n$, uses a generalized
form of the permutation symbol

$$s(j_1, j_2, \ldots, j_n) = \operatorname{sgn} \prod_{p<q} (j_q - j_p). \tag{18}$$

The sgn function gives the sign of its argument, so that it
equals 1 if its argument is positive, -1 if its argument is negative,
and 0 if its argument is zero. For n = 3,

$$s(j_1, j_2, j_3) = \operatorname{sgn}[(j_2 - j_1)(j_3 - j_1)(j_3 - j_2)], \tag{19}$$

so that, for example,

$$s(1, 2, 3) = 1, \quad s(2, 1, 3) = -1, \quad s(1, 1, 3) = 0. \tag{20}$$

Because $s(j_1, j_2, j_3)$ suppresses terms with two equal indices, and
assigns others a sign depending on the order of the indices, it is
the same as the permutation symbol, $\varepsilon_{j_1 j_2 j_3}$ (Eqn A.3.39).

The definition of the determinant gives the familiar result for 
n = 2: 


$$|A| = \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \sum_{j_1=1}^{2} \sum_{j_2=1}^{2} s(j_1, j_2)\, a_{1j_1} a_{2j_2}$$

$$= s(1,1)\,a_{11}a_{21} + s(1,2)\,a_{11}a_{22} + s(2,1)\,a_{12}a_{21} + s(2,2)\,a_{12}a_{22}$$

$$= a_{11}a_{22} - a_{12}a_{21}, \tag{21}$$

because s(1, 1) = s(2, 2) = 0, s(1, 2) = 1, and s(2, 1) = -1. For
a matrix with only one element, the determinant equals the
matrix element.
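
To make the definition concrete, Eqns 17 and 18 can be coded literally. This is only a sketch: the names `sgn`, `s`, and `det` are chosen here, indices run from 0 rather than 1 (which leaves the sign unchanged, because only index differences enter), and the sum has n^n terms, so it is practical only for very small matrices.

```python
from itertools import product
from math import prod

def sgn(x):
    """Sign of x: 1, -1, or 0."""
    return (x > 0) - (x < 0)

def s(indices):
    """Generalized permutation symbol of Eqn 18."""
    n = len(indices)
    return sgn(prod(indices[q] - indices[p]
                    for p in range(n) for q in range(p + 1, n)))

def det(A):
    """Determinant from the n-fold sum of Eqn 17."""
    n = len(A)
    total = 0
    for js in product(range(n), repeat=n):
        sign = s(js)
        if sign:  # terms with a repeated index are suppressed
            total += sign * prod(A[i][js[i]] for i in range(n))
    return total
```

For a 2 × 2 matrix this reduces to the familiar result of Eqn 21.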

Among the properties of determinants that we will find 
useful in solving systems of equations are: 

• The determinant of a matrix equals that of its transpose,
$|A| = |A^T|$.

• If two rows or columns of a matrix are interchanged, the
determinant has the same absolute value but changes
sign.

• If one row (or column) is multiplied by a constant, the
determinant is multiplied by that constant.

• If a multiple of one row (or column) is added to another
row (or column), the determinant is unchanged.

• If two rows or columns of a matrix are the same, the 
determinant is zero. 

Proving these properties is left for the problems. 

A.4.3 Inverse

For an n × n square matrix A, the inverse matrix $A^{-1}$ is defined
such that multiplication by the inverse gives the identity matrix

$$A^{-1}A = AA^{-1} = I. \tag{22}$$

$A^{-1}$ can be written in terms of the cofactor matrix, C, whose
elements

$$c_{ij} = (-1)^{i+j} |A_{ij}| \tag{23}$$

are formed from the determinants of $A_{ij}$, an (n - 1) × (n - 1)
square matrix formed by deleting the i-th row and j-th column
from A. If |A| is not zero,

$$A^{-1} = C^T / |A|. \tag{24}$$

For the familiar n = 2 case, see problem 7.

A matrix whose determinant is zero does not have an inverse,
and is called singular. Because the determinant of a matrix with
two equal rows or columns is zero, such a matrix is singular.
More generally, a matrix is singular if a row or column is a
linear combination of the others.

The inverse of the matrix product AB, if AB is nonsingular,
obeys

$$(AB)^{-1} = B^{-1}A^{-1}. \tag{25}$$

A matrix A whose transpose equals its inverse,

$$A^{-1} = A^T, \tag{26}$$

is called orthogonal. By extension, a matrix A with complex
elements is unitary if its adjoint and inverse are equal

$$A^{-1} = A^\dagger. \tag{27}$$

A.4.4 Systems of linear equations

A vector-matrix representation is often used for systems of
linear equations. In this formulation, a system of m equations
for n unknown variables

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \tag{28}$$

is written in the form

$$\sum_{j=1}^{n} a_{ij}x_j = b_i \quad \text{or} \quad A\mathbf{x} = \mathbf{b}, \tag{29}$$

by defining the matrix of coefficients and column vectors for
the unknowns and right-hand side,

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}. \tag{30}$$

The coefficient matrix A is m × n, because there is one row for
each equation, and one column for each unknown.

The Ax = b form illustrates that whether a system of equations
can be solved depends on the matrix A. A system of equations
is called homogeneous in the special case that b = 0, and
inhomogeneous for all other cases in which b ≠ 0. We consider
here only systems where the number of unknowns and equations
are equal, so the coefficient matrix A is square. If A possesses an
inverse, both sides can be premultiplied by $A^{-1}$, and

$$A^{-1}A\mathbf{x} = I\mathbf{x} = \mathbf{x} = A^{-1}\mathbf{b} \tag{31}$$

yields a unique solution vector x. For inhomogeneous systems,
computing $A^{-1}$ provides a straightforward manner of solving
for the unknown variables $x_i$. For homogeneous systems of
equations, Eqn 31 shows that x = 0 if $A^{-1}$ exists. Thus,
for a homogeneous system to have a nonzero or nontrivial
solution, A must be singular. This occurs if the determinant of
A is zero, implying that some of the rows (or columns) of A are
not linearly independent. If a nontrivial solution of the homogeneous
system exists, any constant times that solution is also
a solution.

If the coefficient matrix is singular, the corresponding 
inhomogeneous system of equations does not have unique 
solutions, and may have none. The existence of A -1 and the 
solvability of the equations thus depend on whether the rows 
and columns of A are linearly independent. For example, if the 
rows are linearly dependent, there are fewer independent equa¬ 
tions than unknowns and difficulties result, as discussed in the 
context of inverse problems (Chapter 7). 
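
The cofactor formula for the inverse (Eqn 24) and the solution x = A⁻¹b (Eqn 31) can be sketched as follows. The helper names are invented for this illustration, integer matrix entries are assumed so that exact fractions can be used, and the recursive determinant is far too slow for large matrices; in practice the elimination methods of the next section are preferred.

```python
from fractions import Fraction

def minor(A, i, j):
    """Matrix A_ij: A with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 0:
        return 1  # convention for the empty matrix, needed for the 1 x 1 case
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    """Inverse as C^T / |A|, with cofactors c_ij = (-1)^(i+j) |A_ij|."""
    n = len(A)
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    C = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
         for i in range(n)]
    return [[Fraction(C[j][i], d) for j in range(n)] for i in range(n)]

def solve_with_inverse(A, b):
    """Solution x = A^-1 b of an inhomogeneous system."""
    return [sum(a * bj for a, bj in zip(row, b)) for row in inverse(A)]
```

A singular coefficient matrix, such as one whose second row is twice the first, raises an error rather than returning an inverse.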

A.4.5 Solving systems of equations on a computer

Standard methods exist to solve linear equations on a computer.
Consider the basic problem

$$A\mathbf{x} = \mathbf{b} \tag{32}$$

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

in which we solve for x, given A and b. If A were a triangular
matrix T, with zeroes below the diagonal, it would be easy to
solve the system

$$T\mathbf{x} = \mathbf{d} \tag{33}$$

$$\begin{pmatrix} t_{11} & t_{12} & t_{13} \\ 0 & t_{22} & t_{23} \\ 0 & 0 & t_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}$$

by starting with the simplest (bottom) equation, solving for $x_3$,
and solving the other equations in succession to find $x_2$ and
then $x_1$. In other words, the solution

$$x_3 = d_3 / t_{33} \tag{34}$$

can be substituted into the middle equation to find

$$x_2 = (d_2 - t_{23}x_3) / t_{22}. \tag{35}$$

Then, by substituting $x_3$ and $x_2$ into the first equation,

$$x_1 = (d_1 - t_{13}x_3 - t_{12}x_2) / t_{11}. \tag{36}$$

The importance of this idea is that an arbitrary matrix can
be triangularized. Consider that the solution of the system of
equations is not changed by any of the following elementary
row operations:

(i) Rearranging the equations, which corresponds to interchanging
rows in the b vector and matrix, i.e.,

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_3 \\ b_2 \end{pmatrix}. \tag{37}$$

The solution is unchanged because the order of the
equations is arbitrary.

(ii) Multiplying an equation by a constant c, which corresponds
to multiplying a row of A and the corresponding
element of b by a constant, i.e.,

$$\begin{pmatrix} ca_{11} & ca_{12} & ca_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} cb_1 \\ b_2 \\ b_3 \end{pmatrix}. \tag{38}$$

(iii) Adding two equations, which corresponds to adding a
multiple of one row to another, i.e.,

$$\begin{pmatrix} ca_{11} + a_{21} & ca_{12} + a_{22} & ca_{13} + a_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} cb_1 + b_2 \\ b_2 \\ b_3 \end{pmatrix}. \tag{39}$$
Thus if the system Ax = b is transformed into Tx = d using
elementary row operations, the two systems of equations have
the same solutions x. This provides a fast method of solving the
system: combine A and b into a single augmented matrix

$$(A, \mathbf{b}) = \begin{pmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ a_{31} & a_{32} & a_{33} & b_3 \end{pmatrix} \tag{40}$$

and triangularize the augmented matrix to obtain

$$(T, \mathbf{d}) = \begin{pmatrix} t_{11} & t_{12} & t_{13} & d_1 \\ 0 & t_{22} & t_{23} & d_2 \\ 0 & 0 & t_{33} & d_3 \end{pmatrix}, \tag{41}$$

which represents a set of equations easily solved for x by the
method in Eqns 34-6.

The matrix is triangularized using the following method 
column by column: 

• Find the element of maximum absolute value in the 
column on or below the diagonal. 

• If this “pivot” element is below the diagonal, interchange
rows to get it on the diagonal.

• Subtract multiples of the pivot row from rows below it to
get zeroes below the diagonal.

The pivoting, though not absolutely necessary, avoids possible 
numerical difficulties. Note that once a column is zeroed below 
the diagonal, we do not have to think about it any more. 

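For an illustration of this method, called Gaussian elimination
with partial pivoting, consider solving the system of
equations

$$\begin{aligned} x_1 + x_2 &= 5, \\ 4x_1 + x_2 + x_3 &= 4, \\ 2x_1 + 2x_2 + 2x_3 &= 3. \end{aligned} \tag{42}$$

This can be expressed in matrix form as

$$\begin{pmatrix} 1 & 1 & 0 \\ 4 & 1 & 1 \\ 2 & 2 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 5 \\ 4 \\ 3 \end{pmatrix} \tag{43}$$

and solved by triangularizing the augmented matrix

$$\begin{pmatrix} 1 & 1 & 0 & 5 \\ 4 & 1 & 1 & 4 \\ 2 & 2 & 2 & 3 \end{pmatrix}. \tag{44}$$

To get zeroes below the diagonal in the first column, we first
move 4, the element with the largest absolute value in the first
column, to the diagonal by interchanging rows

$$\begin{pmatrix} 4 & 1 & 1 & 4 \\ 1 & 1 & 0 & 5 \\ 2 & 2 & 2 & 3 \end{pmatrix}. \tag{45}$$

We then subtract 1/4 times the first row from the second, and 1/2
times the first row from the third, leaving

$$\begin{pmatrix} 4 & 1 & 1 & 4 \\ 0 & 0.75 & -0.25 & 4 \\ 0 & 1.5 & 1.5 & 1 \end{pmatrix}. \tag{46}$$

Next, to zero the elements below the diagonal in the second column,
we interchange rows to get the pivot for this column, 1.5,
on the diagonal:

$$\begin{pmatrix} 4 & 1 & 1 & 4 \\ 0 & 1.5 & 1.5 & 1 \\ 0 & 0.75 & -0.25 & 4 \end{pmatrix}, \tag{47}$$

and subtract 0.75/1.5 = 0.5 times the second row from the third

$$\begin{pmatrix} 4 & 1 & 1 & 4 \\ 0 & 1.5 & 1.5 & 1 \\ 0 & 0 & -1 & 3.5 \end{pmatrix} \tag{48}$$

to complete the triangularization. We then solve the equations
for x, beginning with the bottom one, as in Eqns 34-6, which
gives $x_3 = -3.5$, $x_2 = 25/6 \approx 4.17$, and $x_1 = 5/6 \approx 0.83$.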
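
The procedure above can be sketched in a few lines of code. This is a minimal illustration rather than a production routine: the function name `gauss_solve` is an assumption made here, the test for a singular matrix is a crude exact-zero check on the pivot, and library solvers should be preferred for real work.

```python
def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    Returns the solution x and the triangularized augmented matrix (T, d).
    """
    n = len(A)
    # augmented matrix (A, b), as in Eqn 40
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # pivot: largest absolute value on or below the diagonal
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # subtract multiples of the pivot row from the rows below it
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # back substitution, starting with the bottom equation (Eqns 34-6)
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x, M

x, T_d = gauss_solve([[1, 1, 0], [4, 1, 1], [2, 2, 2]], [5, 4, 3])
```

Applied to the example system, the returned `T_d` should reproduce the triangular form of Eqn 48.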

A similar procedure can be used to invert a matrix. This
method uses the idea that two vector-matrix equations

$$A\mathbf{x} = \mathbf{b} \quad \text{and} \quad A\mathbf{y} = \mathbf{c} \tag{49}$$

can be combined into one by forming an augmented matrix
from each pair of vectors,

$$X = (\mathbf{x}, \mathbf{y}), \quad B = (\mathbf{b}, \mathbf{c}), \tag{50}$$

and writing the matrix equation

$$AX = B. \tag{51}$$

Because x, the solution to Ax = b, is not changed by elementary
row operations on the augmented matrix (A, b), the corresponding
solution to AX = B is unaffected by elementary row
operations on the augmented matrix (A, B).

To apply this to matrix inversion, consider a special case

$$AX = I, \tag{52}$$

whose solution $X = A^{-1}$ is the inverse of the n × n matrix A. X
is unaffected by elementary row operations that convert the
augmented matrix

$$(A, I) = \begin{pmatrix} a_{11} & \cdots & a_{1n} & 1 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} & 0 & \cdots & 1 \end{pmatrix} \tag{53}$$

to one whose left side is the identity

$$(I, B) = \begin{pmatrix} 1 & \cdots & 0 & b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 1 & b_{n1} & \cdots & b_{nn} \end{pmatrix}, \tag{54}$$

so the corresponding equation

$$IX = B \tag{55}$$

shows that the right side of the matrix gives $B = X = A^{-1}$, the inverse
of A. The sequence of operations used to diagonalize the
left (A) side of the augmented matrix (A, I) is similar to those
that triangularize a matrix.
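
One way to carry out the reduction of (A, I) to (I, B) is sketched below. The text does not spell out the exact sequence of operations, so the choices here (partial pivoting, scaling each pivot row to 1, and clearing the column both above and below the pivot) are assumptions, as is the function name `invert`.

```python
def invert(A):
    """Invert A by row-reducing the augmented matrix (A, I) to (I, B)."""
    n = len(A)
    M = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]  # scale so the diagonal element is 1
        for r in range(n):
            if r != col:  # clear the column above and below the pivot
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]  # right-hand block is B = A^-1

A = [[1, 1, 0], [4, 1, 1], [2, 2, 2]]
A_inv = invert(A)
```

Multiplying `A` by `A_inv` should recover the identity matrix to within rounding error.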


A.5 Vector transformations

In seismology, we often apply two types of transformations
to vectors. In the first, the same vector is expressed in two
different coordinate systems. In the second, some operation
converts a vector to another vector expressed in the same coordinate
system. In this section we summarize these transformations
and their differences.

A.5.1 Coordinate transformations

We have seen that vectors remain the same regardless of the coordinate
system in which they are defined, although their components
differ between coordinate systems. Thus vectors can
be defined in one coordinate system (for example, one oriented
along an earthquake fault plane) and reexpressed in another
(such as a geographic coordinate system). This property is very
useful for solving problems and gives valuable insight into the
nature of vectors.

To define the relation between vector components and coordinate
systems, consider two orthogonal Cartesian coordinate
systems (Fig. A.5-1). Because the origins are the same, one coordinate
system can be obtained by rotating the other through
three angles. The relation between the two sets of unit basis
vectors, $\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3$ and $\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3$, is given by their scalar products,
called direction cosines,

$$\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j = \cos \alpha_{ij} = a_{ij}, \tag{1}$$

where the angles $\alpha_{ij}$ are the angles between the two sets of axes.

Fig. A.5-1 The relation between two orthogonal coordinate systems with
the same origin is described by the angles α between the two sets of axes.

A vector can be expressed in terms of its components in the
two coordinate systems

$$\mathbf{u} = u_1\hat{\mathbf{e}}_1 + u_2\hat{\mathbf{e}}_2 + u_3\hat{\mathbf{e}}_3 = u'_1\hat{\mathbf{e}}'_1 + u'_2\hat{\mathbf{e}}'_2 + u'_3\hat{\mathbf{e}}'_3. \tag{2}$$

Given the components $u_i$ in the unprimed system, the components
$u'_i$ in the primed system are found by taking the scalar
products of the vector with the basis vectors of the primed
system:

$$\begin{aligned} u'_1 &= \hat{\mathbf{e}}'_1 \cdot \mathbf{u} = (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1)u_1 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2)u_2 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3)u_3 = a_{11}u_1 + a_{12}u_2 + a_{13}u_3, \\ u'_2 &= \hat{\mathbf{e}}'_2 \cdot \mathbf{u} = a_{21}u_1 + a_{22}u_2 + a_{23}u_3, \\ u'_3 &= \hat{\mathbf{e}}'_3 \cdot \mathbf{u} = a_{31}u_1 + a_{32}u_2 + a_{33}u_3. \end{aligned} \tag{3}$$

These can be written as a matrix equation

$$\mathbf{u}' = A\mathbf{u}, \quad \text{or} \quad \begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}, \tag{4}$$

where A is the matrix that transforms a vector from the
unprimed to the primed system. Note that this is not a relation
between two different vectors u and u'; it is a relationship
between the components of the same vector in two coordinate
systems. It turns out that the matrix A uniquely describes the
transformation between these coordinate systems.

For example, a unit basis vector for the unprimed system

$$\hat{\mathbf{e}}_1 = 1\hat{\mathbf{e}}_1 + 0\hat{\mathbf{e}}_2 + 0\hat{\mathbf{e}}_3 = (1, 0, 0) \tag{5}$$

has components in the primed system given by

$$\begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \tag{6}$$

and so is written

$$a_{11}\hat{\mathbf{e}}'_1 + a_{21}\hat{\mathbf{e}}'_2 + a_{31}\hat{\mathbf{e}}'_3 = (a_{11}, a_{21}, a_{31}) \tag{7}$$

in the primed system. The last expression is just the first column
of A. Similarly, the components of $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ in the primed
system are the second and third columns of A, respectively.
Thus the columns of the transformation matrix A are the basis
vectors of the unprimed system written in terms of their components
in the primed system.

For example, consider rotating a Cartesian coordinate system
by θ counterclockwise about the $\hat{\mathbf{e}}_3$ axis, so that the only
rotation occurs in the $x_1$-$x_2$ plane. The $\hat{\mathbf{e}}'_3$ axis is also the $\hat{\mathbf{e}}_3$ axis
(Fig. A.5-2). The elements of the transformation matrix are
found by evaluating the scalar products of the basis vectors
$a_{ij} = \hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j$, so

$$\begin{aligned} a_{11} &= \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1 = \cos\theta, & a_{12} &= \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2 = \cos(90^\circ - \theta) = \sin\theta, \\ a_{22} &= \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2 = \cos\theta, & a_{21} &= \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1 = \cos(90^\circ + \theta) = -\sin\theta, \\ a_{33} &= \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3 = 1, & a_{13} &= a_{23} = a_{31} = a_{32} = 0, \end{aligned} \tag{8}$$

Fig. A.5-2 The relation between the axes of two orthogonal coordinate
systems differing by a rotation θ in the $x_1$-$x_2$ plane.

and the components of a vector in the two systems are related
by

$$\begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}. \tag{9}$$

Thus the $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}'_1$, and the $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}'_2$ components differ,
whereas the $\hat{\mathbf{e}}_3$ and $\hat{\mathbf{e}}'_3$ components are the same. To check this,
consider the case where θ = 90°. As expected, (1, 0, 0) in the
unprimed system becomes (0, -1, 0) in the primed system, and
(0, 1, 0) in the unprimed system becomes (1, 0, 0) in the primed
system, while (0, 0, 1) in the unprimed system remains (0, 0, 1)
in the primed system.
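
The θ = 90° check can be repeated numerically. In this short sketch the function names `rotation_about_e3` and `transform` are chosen for illustration, and angles are given in degrees.

```python
import math

def rotation_about_e3(theta_deg):
    """Transformation matrix of Eqn 9 for a counterclockwise axis rotation."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(A, u):
    """Components u' = A u of the same vector in the primed system."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A90 = rotation_about_e3(90.0)
e1_primed = transform(A90, [1.0, 0.0, 0.0])  # approximately (0, -1, 0)
e2_primed = transform(A90, [0.0, 1.0, 0.0])  # approximately (1, 0, 0)
e3_primed = transform(A90, [0.0, 0.0, 1.0])  # (0, 0, 1)
```

The results match the expected values only to within floating-point rounding, because cos 90° is not represented exactly.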

Seismologists often use such a geometry. Because the ground 
motion is a vector, seismometers are generally oriented to 
record its components in the east-west, north-south, and 
up-down directions. This decomposition is less useful than 
decomposing ground motion into its radial and transverse 
components, those along and perpendicular to the great circle 
connecting the earthquake and seismometer. The vertical com¬ 
ponent is useful as is, so a rotation about the vertical by the angle 
between East and the great circle connecting the earthquake 
and seismometer converts the E-W and N-S components into 
the new representation. The relevant angle, the back azimuth 
to the source from the receiver, is discussed in Section A.7.2. 

We can also reverse the transformation. By analogy to Eqn 3, 
the components in the unprimed system can be found from 
those in the primed system as 

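$$\begin{aligned} u_1 &= \hat{\mathbf{e}}_1 \cdot \mathbf{u} = (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_1)u'_1 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_2)u'_2 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_3)u'_3 = a_{11}u'_1 + a_{21}u'_2 + a_{31}u'_3, \\ u_2 &= \hat{\mathbf{e}}_2 \cdot \mathbf{u} = a_{12}u'_1 + a_{22}u'_2 + a_{32}u'_3, \\ u_3 &= \hat{\mathbf{e}}_3 \cdot \mathbf{u} = a_{13}u'_1 + a_{23}u'_2 + a_{33}u'_3. \end{aligned} \tag{10}$$

Combining these to express the reverse transformation in
vector-matrix form,

$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix}, \tag{11}$$

shows that the reverse transformation matrix is just the transpose
of the transformation matrix A

$$\mathbf{u} = A^T\mathbf{u}'. \tag{12}$$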

Hence a unit basis vector in the primed system 

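$$\hat{\mathbf{e}}'_1 = 1\hat{\mathbf{e}}'_1 + 0\hat{\mathbf{e}}'_2 + 0\hat{\mathbf{e}}'_3 \tag{13}$$

becomes, by the matrix transformation,

$$a_{11}\hat{\mathbf{e}}_1 + a_{12}\hat{\mathbf{e}}_2 + a_{13}\hat{\mathbf{e}}_3 \tag{14}$$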

in the unprimed system. This is the first row of A, so the rows of 
A are the primed basis vectors expressed in the unprimed coor¬ 
dinates. This is natural because the transformations are related 
by the matrix transpose. 

Alternatively, the reverse transformation can be found 
directly by starting with u' = Au and multiplying both sides by 
the inverse matrix 

$$A^{-1}\mathbf{u}' = A^{-1}A\mathbf{u} = I\mathbf{u} = \mathbf{u}. \tag{15}$$

Comparison with Eqn 12 shows that the inverse of the trans¬ 
formation matrix equals its transpose, so the transformation 
matrix is an orthogonal matrix. This seems reasonable because 
the columns of A, which represent orthogonal basis vectors, 
are orthogonal. Similarly, the rows of A are orthogonal. As a 
result, such coordinate transformations are called orthogonal 
transformations. An important feature of orthogonal trans¬ 
formations, whose proof is left as a homework problem, is that 
they preserve the length of vectors. 
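
These properties are easy to check numerically for a particular case. The sketch below is an illustration only, not a proof: the rotation angle and test vector are arbitrary choices, and the helper names are invented here.

```python
import math

def rotation_about_e3(theta_deg):
    """Transformation matrix for a rotation of the axes about e3."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def norm(u):
    return math.sqrt(sum(x * x for x in u))

A = rotation_about_e3(30.0)
u = [3.0, 4.0, 12.0]                     # length 13
u_primed = matvec(A, u)                  # components in the primed system
u_back = matvec(transpose(A), u_primed)  # reverse transformation, u = A^T u'
```

Both `norm(u)` and `norm(u_primed)` should equal 13 to within rounding, and `u_back` should reproduce `u`.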

The transformation relations, Eqns 4 and 12, provide a 
mathematical definition of a vector. Any vector must transform 
between coordinate systems in this way. A set of three entities 
defined at points in space (for example, temperature, pressure, 
and density) that does not obey the transformation equations is 
not a vector. 

A.5.2 Eigenvalues and eigenvectors 

The product of an arbitrary n × n matrix A and an arbitrary
n-component vector x

$$\mathbf{y} = A\mathbf{x} \tag{16}$$

is also a vector in n dimensions. This is not the same as coordinate
transformation; the vector x is transformed into
another distinct vector, with both vectors expressed in the same
coordinate system.

A physically important class of transformations convert a
vector into one parallel to the original vector, so that

$$A\mathbf{x} = \lambda\mathbf{x}, \tag{17}$$

where A is a matrix, and λ is a scalar. The only effect of the
transformation is that the length of x changes by a factor of λ.
For a given A, it is useful to know which vectors x and scalars λ
satisfy this equation.

In three dimensions, the case most commonly encountered,
Eqn 17 can be written

$$(A - \lambda I)\mathbf{x} = \begin{pmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{18}$$

This is a homogeneous system of linear equations, so nontrivial
solutions exist only if the matrix (A - λI) is singular. We thus
seek values of λ such that the determinant

$$|A - \lambda I| = \det \begin{pmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{pmatrix} = 0. \tag{19}$$

Evaluating the determinant gives the characteristic polynomial

$$\lambda^3 - I_1\lambda^2 + I_2\lambda - I_3 = 0, \tag{20}$$

which depends on three constants called the invariants of A:

$$\begin{aligned} I_1 &= a_{11} + a_{22} + a_{33}, \\ I_2 &= \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \det \begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix} + \det \begin{pmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{pmatrix}, \\ I_3 &= \det A. \end{aligned} \tag{21}$$

$I_1$, the first invariant, or trace, of A, is the sum of the diagonal
elements of A. The invariants of a matrix have significance for
stresses, strains, and earthquake moment tensors, because they
are not changed by orthogonal transformations.

The characteristic polynomial is a cubic equation in λ
with three roots, or eigenvalues, $\lambda_m$, for which the determinant
|A - λI| is zero. For each eigenvalue there is an associated nontrivial
eigenvector, $\mathbf{x}^{(m)}$, satisfying

$$A\mathbf{x}^{(m)} = \lambda_m \mathbf{x}^{(m)}. \tag{22}$$

The components of the eigenvector, $x_1^{(m)}, x_2^{(m)}, x_3^{(m)}$, are found
by solving

$$\begin{pmatrix} a_{11} - \lambda_m & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda_m & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda_m \end{pmatrix} \begin{pmatrix} x_1^{(m)} \\ x_2^{(m)} \\ x_3^{(m)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{23}$$

Each eigenvalue and its associated eigenvector form a pair
satisfying Eqn 22. In general, an eigenvalue and the eigenvector
associated with a different eigenvalue will not satisfy the
equation.

For example, the eigenvalues of

$$A = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{pmatrix} \tag{24}$$

are found by solving the characteristic polynomial

$$\lambda^3 - 8\lambda^2 + 19\lambda - 12 = 0, \tag{25}$$

whose roots are $\lambda_1 = 4$, $\lambda_2 = 3$, $\lambda_3 = 1$. Next, the equations

$$\begin{pmatrix} 3 - \lambda_m & -1 & 0 \\ -1 & 2 - \lambda_m & -1 \\ 0 & -1 & 3 - \lambda_m \end{pmatrix} \begin{pmatrix} x_1^{(m)} \\ x_2^{(m)} \\ x_3^{(m)} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{26}$$

are solved for each eigenvalue to yield the associated eigenvector.
Thus for $\lambda_3 = 1$,

$$\begin{aligned} 2x_1^{(3)} - x_2^{(3)} &= 0, \\ -x_1^{(3)} + x_2^{(3)} - x_3^{(3)} &= 0, \\ -x_2^{(3)} + 2x_3^{(3)} &= 0. \end{aligned} \tag{27}$$

All three unknowns cannot be found uniquely, because these
are homogeneous equations. We thus set $x_1^{(3)}$ equal to 1 and
find the other two unknowns, $x_2^{(3)} = 2$, $x_3^{(3)} = 1$. Similarly, the
other eigenvectors are found by substituting $\lambda_2$ and $\lambda_1$ in
Eqn 26, so

$$\mathbf{x}^{(3)} = (1, 2, 1), \quad \mathbf{x}^{(2)} = (1, 0, -1), \quad \mathbf{x}^{(1)} = (1, -1, 1). \tag{28}$$

Because the eigenvectors are solutions to a set of homogeneous
equations, any multiple of an eigenvector is also an
eigenvector. The eigenvectors thus determine a direction in
space, but the magnitude of the vector is arbitrary. Often the
eigenvectors are normalized to unit magnitude. The set we have
found can be written as

$$\mathbf{x}^{(1)} = (1/\sqrt{3}, -1/\sqrt{3}, 1/\sqrt{3}), \quad \mathbf{x}^{(2)} = (1/\sqrt{2}, 0, -1/\sqrt{2}), \quad \mathbf{x}^{(3)} = (1/\sqrt{6}, 2/\sqrt{6}, 1/\sqrt{6}). \tag{29}$$
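
The worked example can be verified with a few lines of integer arithmetic. The sketch below recomputes the invariants of the matrix in Eqn 24, checks that the quoted eigenvalues are roots of the characteristic polynomial, and confirms that each eigenvector satisfies Eqn 22; all helper names are chosen here for illustration.

```python
A = [[3, -1, 0], [-1, 2, -1], [0, -1, 3]]

def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3(M):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    return (M[0][0] * det2(M[1][1], M[1][2], M[2][1], M[2][2])
            - M[0][1] * det2(M[1][0], M[1][2], M[2][0], M[2][2])
            + M[0][2] * det2(M[1][0], M[1][1], M[2][0], M[2][1]))

# invariants of A (Eqn 21)
I1 = A[0][0] + A[1][1] + A[2][2]
I2 = (det2(A[0][0], A[0][1], A[1][0], A[1][1])
      + det2(A[1][1], A[1][2], A[2][1], A[2][2])
      + det2(A[0][0], A[0][2], A[2][0], A[2][2]))
I3 = det3(A)

def char_poly(lam):
    """Left-hand side of the characteristic polynomial (Eqn 20)."""
    return lam ** 3 - I1 * lam ** 2 + I2 * lam - I3

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# eigenvalue -> unnormalized eigenvector, from Eqn 28
pairs = {4: [1, -1, 1], 3: [1, 0, -1], 1: [1, 2, 1]}
checks = {lam: matvec(A, x) == [lam * xi for xi in x] for lam, x in pairs.items()}
```

The invariants come out as 8, 19, and 12, the coefficients of Eqn 25.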


Sometimes complications arise, as for the matrix

$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{30}$$

with eigenvalues 1, 1, and 0. Using the method given above to
find the eigenvector for $\lambda_3 = 0$ by setting $x_1^{(3)} = 1$ yields no solution.
Setting $x_2^{(3)} = 1$, however, yields a correct solution for the
eigenvector, (0, 1, 0). Because this has no $\hat{\mathbf{e}}_1$ component, we
could not have set $x_1^{(3)} = 1$ and found the other components.

This example illustrates a complication that arises for a degenerate,
or repeated, eigenvalue: e.g., $\lambda_1 = \lambda_2 = 1$. In this case,
the eigenvalue corresponds not to a single eigenvector direction but to an
entire plane, and any vector contained within it is an eigenvector.
Two eigenvectors spanning this plane can be found by finding
the eigenvector of the nondegenerate eigenvalue, and then
choosing two independent vectors orthogonal to it. Because the
eigenvector for the nondegenerate eigenvalue is (0, 1, 0), two
possible orthogonal eigenvectors for the degenerate eigenvalue
are (1, 0, 0) and (0, 0, 1).

A.5.3 Symmetric matrix eigenvalues, eigenvectors,
diagonalization, and decomposition

The eigenvalues and eigenvectors of a symmetric matrix have 
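interesting properties. An n × n matrix H has a characteristic
polynomial of degree n, each of whose n roots is an eigenvalue.
Consider two eigenvalues and their associated eigenvectors

$$H\mathbf{x}^{(i)} = \lambda_i \mathbf{x}^{(i)}, \quad H\mathbf{x}^{(j)} = \lambda_j \mathbf{x}^{(j)}. \tag{31}$$

Multiplication of the first equation by $\mathbf{x}^{(j)T}$ (the transpose of
$\mathbf{x}^{(j)}$) and the second equation by $\mathbf{x}^{(i)T}$ yields

$$\mathbf{x}^{(j)T} H \mathbf{x}^{(i)} = \lambda_i \mathbf{x}^{(j)T}\mathbf{x}^{(i)}, \quad \mathbf{x}^{(i)T} H \mathbf{x}^{(j)} = \lambda_j \mathbf{x}^{(i)T}\mathbf{x}^{(j)}. \tag{32}$$

Transposing both sides of the second part of Eqn 32 and subtracting
it from the first gives

$$\mathbf{x}^{(j)T} H \mathbf{x}^{(i)} - \mathbf{x}^{(j)T} H^T \mathbf{x}^{(i)} = (\lambda_i - \lambda_j)\, \mathbf{x}^{(j)T}\mathbf{x}^{(i)}. \tag{33}$$

Because H is symmetric, it equals its transpose, $H = H^T$, so the
left-hand side is zero

$$0 = (\lambda_i - \lambda_j)\, \mathbf{x}^{(j)T}\mathbf{x}^{(i)}. \tag{34}$$

Thus, if i ≠ j and the two eigenvalues are different, their associated
eigenvectors must be orthogonal so that their scalar
product $\mathbf{x}^{(j)T}\mathbf{x}^{(i)}$ is zero. Thus, for a symmetric matrix, eigenvectors
associated with distinct eigenvalues are orthogonal.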

This result lets us diagonalize a symmetric matrix. To illustrate
this for a 3 × 3 case, consider a matrix U whose columns
are the eigenvectors of the symmetric matrix H

$$U = \begin{pmatrix} x_1^{(1)} & x_1^{(2)} & x_1^{(3)} \\ x_2^{(1)} & x_2^{(2)} & x_2^{(3)} \\ x_3^{(1)} & x_3^{(2)} & x_3^{(3)} \end{pmatrix}. \tag{35}$$

If the eigenvalues of H are distinct, the eigenvectors of H, and
hence the columns of the eigenvector matrix, are orthogonal,
so, with the eigenvectors normalized to unit length, U is an
orthogonal matrix satisfying $U^{-1} = U^T$.

The entire set of eigenvalue-eigenvector pairs, each of which
satisfy $H\mathbf{x}^{(i)} = \lambda_i \mathbf{x}^{(i)}$, can be written as the matrix equation

$$HU = U\Lambda, \tag{36}$$

where Λ is the diagonal matrix with eigenvalues on the diagonal

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}. \tag{37}$$

Premultiplying both sides of Eqn 36 by the inverse of the
eigenvector matrix yields

$$U^{-1}HU = U^T H U = \Lambda, \tag{38}$$

which shows how the eigenvector matrix can be used to
diagonalize a symmetric matrix. This result can also be stated
as

$$H = U\Lambda U^T, \tag{39}$$

which illustrates how a symmetric matrix can be decomposed 
into a diagonal eigenvalue matrix and the orthogonal eigen¬ 
vector matrix. Similar results apply for complex Hermitian 
matrices. 
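
As a numerical illustration, the symmetric matrix of Eqn A.5.24 can be diagonalized with its normalized eigenvectors. The sketch below builds U column by column and evaluates both Eqn 38 and Eqn 39; the variable and helper names are assumptions made for this example.

```python
import math

H = [[3.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 3.0]]
eigenvalues = [4.0, 3.0, 1.0]
eigenvectors = [[1.0, -1.0, 1.0], [1.0, 0.0, -1.0], [1.0, 2.0, 1.0]]

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# U has the unit eigenvectors as its columns (Eqn 35)
U = transpose([normalize(v) for v in eigenvectors])
Lam = [[eigenvalues[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

diagonalized = matmul(matmul(transpose(U), H), U)  # U^T H U, Eqn 38
rebuilt = matmul(matmul(U, Lam), transpose(U))     # U Lambda U^T, Eqn 39
```

To within rounding error, `diagonalized` should equal the eigenvalue matrix and `rebuilt` should recover H.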

We will see that if a matrix contains the components of 
vectors expressed in a coordinate system, the physical problem 
under discussion can be simplified by diagonalizing the matrix. 
This corresponds to rewriting the problem in its “natural” co¬ 
ordinate system, whose basis set is the eigenvectors, an idea 
used in discussing stresses in the earth (Section 2.3.4) and the 
seismic moment tensor (Section 4.4.5). 


A.6 Vector calculus 

A.6.1 Scalar and vector fields

Many phenomena in seismology depend on how physical
quantities vary in space. Some, like density or temperature,
are scalar fields, scalar valued functions of the position vector x
denoted by expressions like $\phi(\mathbf{x})$ or $\phi(x_1, x_2, x_3)$. Similarly,
a vector that varies in space is described by a vector field. For
example, seismic waves are described by the variation in the
displacement vector

$$\mathbf{u}(\mathbf{x}) = \mathbf{u}(x_1, x_2, x_3) = u_1(x_1, x_2, x_3)\hat{\mathbf{e}}_1 + u_2(x_1, x_2, x_3)\hat{\mathbf{e}}_2 + u_3(x_1, x_2, x_3)\hat{\mathbf{e}}_3 \tag{1}$$

as a function of position, and result in turn from forces derived
from spatial derivatives of the stress tensor.

Spatial variations of scalar, vector, or tensor fields are described
using the vector differential operator “del”, ∇,

$$\nabla = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3} \right). \tag{2}$$

This operator has the form of a vector, but has meaning only
when applied to a scalar, vector, or tensor field. We first review
uses of the ∇ operator in Cartesian coordinates, and in the
next section discuss the more complicated forms for spherical
coordinates.


A.6.2 Gradient 

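The simplest application of the $\nabla$ operator is the gradient,
a vector field formed from the spatial derivatives of a scalar
field. If $\phi(\mathbf{x})$ is a scalar function of position, the gradient is
defined by

$$\mathrm{grad}\,\phi(\mathbf{x}) = \nabla\phi(\mathbf{x}) = \frac{\partial\phi(\mathbf{x})}{\partial x_1}\hat{\mathbf{e}}_1 + \frac{\partial\phi(\mathbf{x})}{\partial x_2}\hat{\mathbf{e}}_2 + \frac{\partial\phi(\mathbf{x})}{\partial x_3}\hat{\mathbf{e}}_3, \tag{3}$$

where $\partial\phi(\mathbf{x})/\partial x_1$ is the partial derivative of $\phi(x_1, x_2, x_3)$ with
respect to $x_1$, for $x_2$ and $x_3$ held constant. The gradient is a
vector field whose components equal the partial derivative with
respect to the corresponding coordinate.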

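Expressions like Eqns 1 and 3 can be written more compactly
if the dependences on position are not written explicitly, i.e.,

$$\nabla\phi = \frac{\partial\phi}{\partial x_1}\hat{\mathbf{e}}_1 + \frac{\partial\phi}{\partial x_2}\hat{\mathbf{e}}_2 + \frac{\partial\phi}{\partial x_3}\hat{\mathbf{e}}_3. \tag{4}$$

In this notation, it is implicit that $\phi$, its derivatives, and hence
the gradient, vary with position.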

For example, the elevation $\phi(x_1, x_2)$ is a scalar field describing
the topography as a function of position in a two-dimensional
region. This is often plotted using topographic
contours (Fig. A.6-1), curves along which $\phi$ is constant. At any
point, $\partial\phi/\partial x_1$ is the slope in the $x_1$ direction, and $\partial\phi/\partial x_2$ is the
slope in the $x_2$ direction.

The gradient can be used to find the slope in any direction.
The projection of a vector in a given direction is the scalar
product of the vector and the unit normal vector in that direction,
$\hat{\mathbf{n}} = (n_1, n_2)$. Thus the scalar product of the gradient with
the normal vector,

$$\hat{\mathbf{n}}\cdot\nabla\phi = n_1\frac{\partial\phi}{\partial x_1} + n_2\frac{\partial\phi}{\partial x_2}, \tag{5}$$

gives the directional derivative in the $\hat{\mathbf{n}}$ direction. Because both
$\hat{\mathbf{n}}$ and $\nabla\phi$ are functions of position, the directional derivative
varies in space. At any point, the maximum value of the scalar
product occurs for $\hat{\mathbf{n}}$ parallel to the gradient, so the gradient
points in the direction of the steepest slope along which $\phi$
changes most rapidly. The scalar product is zero when $\hat{\mathbf{n}}$ is
perpendicular to the gradient, so the gradient is perpendicular
to curves of constant $\phi$. These concepts are also used in three
dimensions.

Fig. A.6-1 A scalar field demonstrating the concept of a gradient.
If $\phi(x_1, x_2)$ gives the elevation, the gradient can be used to find the slope
in the $\hat{\mathbf{n}}$ direction at a point $(x_1, x_2)$.

In index notation, the gradient is written as

$$(\nabla\phi)_i = \frac{\partial\phi}{\partial x_i} = \phi_{,i}, \tag{6}$$

where the last form uses a common (if sometimes confusing)
notation in which differentiation is indicated by a comma. The
notation, with one free index, shows that the gradient is a vector.
By contrast, the directional derivative, written as

$$\hat{\mathbf{n}}\cdot\nabla\phi = n_i\frac{\partial\phi}{\partial x_i} = n_i\phi_{,i}, \tag{7}$$

has an implied sum over $i$ and is a scalar.
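The gradient and directional derivative can be illustrated numerically. The sketch below assumes a hypothetical elevation function (a smooth hill of our own choosing) and estimates the derivatives by central differences; the function names are illustrative and not from the text.

```python
import math

def elevation(x1, x2):
    # Hypothetical smooth hill; any differentiable function would do.
    return 100.0 - x1 ** 2 - 2.0 * x2 ** 2

def gradient(f, x1, x2, h=1e-5):
    """Central-difference estimate of (df/dx1, df/dx2)."""
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2.0 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2.0 * h)
    return d1, d2

def directional_derivative(f, x1, x2, n):
    """n . grad f (Eqn 5); n is scaled to unit length first."""
    norm = math.hypot(n[0], n[1])
    g = gradient(f, x1, x2)
    return (n[0] * g[0] + n[1] * g[1]) / norm

g = gradient(elevation, 1.0, 1.0)      # analytic value: (-2, -4)
# Slope along the gradient direction (the steepest slope) ...
steepest = directional_derivative(elevation, 1.0, 1.0, g)
# ... and along a direction perpendicular to it (along a contour).
along_contour = directional_derivative(elevation, 1.0, 1.0, (-g[1], g[0]))
```

The slope along the gradient equals the magnitude of the gradient, and the slope along the contour is zero, as the discussion of Eqn 5 predicts.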

Often, the gradients of quantities are important physically
because an effect depends on spatial variations of a field. For
example, the flow of heat depends on the gradient of the temperature
field (Sections 5.3.2, 5.4.1), and the gradient of the
pressure field in the atmosphere is important for the weather.


A.6.3 Divergence

A related operation that describes the spatial variation of a vector
field is the divergence. The divergence of a vector field $\mathbf{u}(\mathbf{x})$
is given by the scalar product of the $\nabla$ operator with $\mathbf{u}(\mathbf{x})$ as

$$\nabla\cdot\mathbf{u}(\mathbf{x}) = \frac{\partial u_1}{\partial x_1} + \frac{\partial u_2}{\partial x_2} + \frac{\partial u_3}{\partial x_3}, \tag{8}$$

which yields a scalar field because the vector components and
their derivatives are functions of position.

Fig. A.6-2 The divergence, formed from the differences between the flow
into one face of a volume and the flow out of the opposite face, gives the
net flow through a unit volume.

The divergence frequently arises in conservation equations.
For example, if $\mathbf{u}(\mathbf{x})$ is the velocity as a function of position in a
fluid, $\nabla\cdot\mathbf{u}(\mathbf{x})$ gives the net outflow of material per unit time
from a unit volume at position $\mathbf{x}$ (Fig. A.6-2). To see this, note
that, to first order, the net flow in the $x_2$ direction is the difference
between the flow out the far side, $u_2 + \partial u_2/\partial x_2$, and that
into the near side, $u_2$, given as

$$u_2 + \frac{\partial u_2}{\partial x_2} - u_2 = \frac{\partial u_2}{\partial x_2}. \tag{9}$$

Adding similar terms for the net flow in the $x_1$ and $x_3$ directions
gives the divergence (Eqn 8). If the divergence is positive, there
is a net outward flow, whereas a negative divergence indicates a
net inflow.

This idea can be applied to any vector field $\mathbf{u}(\mathbf{x})$. Consider the
problem of finding the net outflow from a region with volume
$V$ and surface $S$. If $\hat{\mathbf{n}}(\mathbf{x})$ is the unit normal vector pointing
outward at a point $\mathbf{x}$ on the surface (Fig. A.6-3), the scalar
product $\hat{\mathbf{n}}(\mathbf{x})\cdot\mathbf{u}(\mathbf{x})$ gives the outward flux per unit area at that
point. Integrating the flux over the surface then gives the total
flux. Another way to compute the total flux is to integrate the
divergence over the volume. These two methods give the same
flux, so

$$\int_S \hat{\mathbf{n}}\cdot\mathbf{u}\,dS = \int_V \nabla\cdot\mathbf{u}\,dV. \tag{10}$$

Fig. A.6-3 Geometry for the divergence theorem: $\hat{\mathbf{n}}(\mathbf{x})$ is a unit vector
pointing outward at the point $\mathbf{x}$ from an element $dS$ of the surface $S$ that
encloses a volume $dV$.

This relation, Gauss's theorem, or the divergence theorem,
says that what accumulates inside a volume is determined by
the integral over its surface of what goes out. If we think of the
volume as many adjacent cells, the flow out of one cell is the
flow into an adjacent cell, which cancels to zero. Only flow into
or out of the volume’s surface is not canceled out in this way.
Written in full, $\int dV$ is a triple integral over the volume, and $\int dS$
is a double integral over the surface.
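Gauss's theorem (Eqn 10) can be checked numerically for a simple case. The sketch below assumes a hypothetical vector field $\mathbf{u} = (x_1^2, x_2, 0)$ on a unit cube, chosen by us because both integrals can also be done by hand; the midpoint rule is our choice of quadrature, not a method from the text.

```python
def u(x1, x2, x3):
    # Hypothetical test field with divergence 2*x1 + 1.
    return (x1 ** 2, x2, 0.0)

def div_u(x1, x2, x3):
    return 2.0 * x1 + 1.0

def volume_integral(n=20):
    """Midpoint-rule integral of div u over the unit cube."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(div_u(a, b, c) for a in pts for b in pts for c in pts) * h ** 3

def surface_flux(n=20):
    """Midpoint-rule integral of n . u over the six faces of the unit cube."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for a in pts:
        for b in pts:
            total += u(1.0, a, b)[0] - u(0.0, a, b)[0]   # faces x1 = 1 and 0
            total += u(a, 1.0, b)[1] - u(a, 0.0, b)[1]   # faces x2 = 1 and 0
            total += u(a, b, 1.0)[2] - u(a, b, 0.0)[2]   # faces x3 = 1 and 0
    return total * h ** 2

vol = volume_integral()
flux = surface_flux()
print(vol, flux)
```

Both integrals equal 2 for this field, so the outward flux through the surface matches the integrated divergence.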

In index notation, using the summation convention, the
divergence is written

$$\nabla\cdot\mathbf{u} = \frac{\partial u_i}{\partial x_i} = u_{i,i}, \tag{11}$$

which is a scalar because no free index remains. Gauss’s theorem
is written

$$\int_S u_i n_i\,dS = \int_V \frac{\partial u_i}{\partial x_i}\,dV, \tag{12}$$

or, using the comma notation for derivatives,

$$\int_S u_i n_i\,dS = \int_V u_{i,i}\,dV. \tag{13}$$

As before, it is implicit in the notation that the field $\mathbf{u}$, its derivatives,
and the normal vector $\hat{\mathbf{n}}$ vary with position.


A.6.4 Curl 

The curl operator, the cross product of the $\nabla$ operator with a
vector field, yields another vector field

$$\nabla\times\mathbf{u} = \hat{\mathbf{e}}_1\left(\frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3}\right) + \hat{\mathbf{e}}_2\left(\frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1}\right) + \hat{\mathbf{e}}_3\left(\frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}\right). \tag{14}$$

This can be written as a determinant

$$\nabla\times\mathbf{u} = \det\begin{pmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \dfrac{\partial}{\partial x_1} & \dfrac{\partial}{\partial x_2} & \dfrac{\partial}{\partial x_3} \\ u_1 & u_2 & u_3 \end{pmatrix} \tag{15}$$

or, using index notation, in a compact form as

$$(\nabla\times\mathbf{u})_i = \epsilon_{ijk}\frac{\partial u_k}{\partial x_j} = \epsilon_{ijk}u_{k,j}. \tag{16}$$

Some physical insight into the curl comes from Stokes’
theorem, which relates the integral of the curl of a vector field
over a surface $S$ to the line integral around a curve $C$ bounding
$S$ (Fig. A.6-4) as

$$\oint_C \mathbf{u}\cdot\hat{\mathbf{t}}\,dC = \int_S (\nabla\times\mathbf{u})\cdot\hat{\mathbf{n}}\,dS. \tag{17}$$

Here $dS$ is an element of surface area with normal $\hat{\mathbf{n}}(\mathbf{x})$, and $dC$
is an element of the curve with tangent $\hat{\mathbf{t}}(\mathbf{x})$. Analogous to the
case of Gauss’s theorem applied to a volume, we can think of
the surface as composed of infinitesimal tiles, each with a line
integral of $\mathbf{u}\cdot\hat{\mathbf{t}}$ around it. The border of each tile is shared with
another tile, but, because the line integral, or circulation, is
computed in a counterclockwise manner, the integrals along
this border are the same but of opposite sign for the two tiles,
and therefore cancel. The segments of the line integrals cancel
between all the tiles except those on the outer border that have
no adjacent circulation to cancel them.

Fig. A.6-4 Geometry for Stokes’ theorem: $\hat{\mathbf{n}}(\mathbf{x})$ is a unit vector pointing
outward at the point $\mathbf{x}$ from an element $dS$ of the surface $S$. $dC$ is an
element of the curve $C$ bounding $S$, with tangent $\hat{\mathbf{t}}(\mathbf{x})$.

If the line integral is nonzero, the vector field has a net rotation
along the curve, so the integral of its curl over the surface is
nonzero. The curl of a vector field shows where rotations arise.
A common application is describing the velocity field of a moving
fluid. The upper portion of Fig. A.6-5 shows streamlines,
lines parallel to the velocity vector at any point, for a viscous
fluid flowing past a circular object. The velocity is zero at the
object, and increases with distance away from it. The flow is
symmetric on the bottom of the object. The lower portion of
the figure shows contours of the curl of the velocity field with
larger values, indicating greater rotations, close to the object.

Two useful identities, whose proofs are left for the problems,
are that the curl of a gradient and the divergence of a curl are
zero:

$$\nabla\cdot(\nabla\times\mathbf{u}) = 0 \tag{18}$$

$$\nabla\times(\nabla\phi) = 0. \tag{19}$$

Equation 19 can be used with Stokes’ theorem to show that for
a vector field written as the gradient of a scalar, the curl, and
hence circulation around an arbitrary curve, are zero. This idea
is used in mechanics to prove that a conservative force (one that
can be written as the gradient of a potential) has a line integral
that is independent of path, because its circulation around any
path is zero. These relations give insight into seismic waves,
because P waves have no curl and S waves have no divergence
(Section 2.4.1).

A.6.5 Laplacian

The Laplacian operator is formed by taking the divergence of
the gradient of a scalar field, which yields a scalar field

$$\nabla^2\phi = \nabla\cdot\nabla\phi = \frac{\partial^2\phi}{\partial x_1^2} + \frac{\partial^2\phi}{\partial x_2^2} + \frac{\partial^2\phi}{\partial x_3^2} = \phi_{,ii}, \tag{20}$$

where the last form uses index notation and the summation
convention. By analogy, the Laplacian of a vector field is a vector
field whose components in Cartesian coordinates are the
Laplacians of the original vector components,

$$\nabla^2\mathbf{u} = (\nabla^2 u_1, \nabla^2 u_2, \nabla^2 u_3). \tag{21}$$

For example, the $\hat{\mathbf{e}}_1$ component of $\nabla^2\mathbf{u}$ is

$$\frac{\partial^2 u_1}{\partial x_1^2} + \frac{\partial^2 u_1}{\partial x_2^2} + \frac{\partial^2 u_1}{\partial x_3^2}. \tag{22}$$

In Cartesian coordinates, the Laplacian of a vector satisfies

$$\nabla^2\mathbf{u} = \nabla(\nabla\cdot\mathbf{u}) - \nabla\times(\nabla\times\mathbf{u}), \tag{23}$$

an obscure-looking relation that is useful in deriving the existence
of P and S waves.
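The behavior of the curl can be explored numerically. The sketch below evaluates the Cartesian curl components by central differences for two hypothetical fields of our own choosing: a rigid rotation about the $x_3$ axis, whose curl is uniform, and the gradient of a scalar, whose curl should vanish. It is an illustration rather than a proof of the identity.

```python
def curl(u, x, h=1e-5):
    """Central-difference curl of a vector field u(x1, x2, x3) at point x."""
    def d(comp, axis):
        # Partial derivative of component `comp` with respect to x[axis].
        xp = list(x)
        xm = list(x)
        xp[axis] += h
        xm[axis] -= h
        return (u(*xp)[comp] - u(*xm)[comp]) / (2.0 * h)
    return (d(2, 1) - d(1, 2),      # du3/dx2 - du2/dx3
            d(0, 2) - d(2, 0),      # du1/dx3 - du3/dx1
            d(1, 0) - d(0, 1))      # du2/dx1 - du1/dx2

def rotation(x1, x2, x3):
    # Rigid rotation about the x3 axis: curl is (0, 0, 2) everywhere.
    return (-x2, x1, 0.0)

def grad_phi(x1, x2, x3):
    # Gradient of the hypothetical scalar phi = x1**2 * x2 + x3.
    return (2.0 * x1 * x2, x1 ** 2, 1.0)

point = (0.3, -1.2, 0.7)            # arbitrary evaluation point
curl_rot = curl(rotation, point)
curl_grad = curl(grad_phi, point)
```

The rotation field has a curl of (0, 0, 2) at every point, whereas the curl of the gradient field is zero to within rounding error.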

A.7 Spherical coordinates

The vector operations discussed so far were performed in
Cartesian coordinates, in which the unit basis vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$
point in the same direction everywhere. There are, however,
situations in which non-Cartesian coordinate systems without
these nice properties are useful. In particular, spherical coordinates
often simplify the solution of problems with a high degree
of symmetry about a point.

A.7.1 The spherical coordinate system

In a spherical coordinate system, a point defined by a position
vector $\mathbf{x}$ is described by its radial distance from the origin, $r =
|\mathbf{x}|$, and two angles. $\theta$ is the colatitude, or angle between $\mathbf{x}$ and
the $x_3$ axis, and $\phi$, the longitude, is measured in the $x_1$-$x_2$ plane.
Often the latitude, $90° - \theta$, is used instead of the colatitude.
Spherical coordinates are often used in seismology because
the earth is approximately spherically symmetric, varying with
depth much more than laterally. Thus properties like velocity
and density are often approximated as functions only of $r$,
independent of $\theta$ and $\phi$.

Fig. A.6-5 Top: streamlines showing
the velocity of fluid flow around an
object. Numbers on streamlines show the
magnitude of the velocity. Bottom: contours
of the curl for this velocity field. The curl is
greatest near the sphere, where the fluid flow
lines are the most curved. (After Batchelor,
1967. Reprinted with the permission of
Cambridge University Press.)

Fig. A.7-1 Relations between spherical $(r, \theta, \phi)$ and Cartesian coordinates
$(x_1, x_2, x_3)$. (After Marion, 1970. From Classical Dynamics of Particles
and Systems, 2nd edn, copyright 1970 by Academic Press, reproduced by
permission of the publisher.)

Fig. A.7-2 Geometry of the latitude and longitude system used to locate
points on the earth’s surface. A point P at 50°N, 60°W ($\theta = 40°$, $\phi = -60°$)
is shown. (After Strahler, 1969.)

Figure A.7-1 shows the relations between rectangular and
spherical coordinates. If the vector $\mathbf{x}$ is written as

$$\mathbf{x} = x_1\hat{\mathbf{e}}_1 + x_2\hat{\mathbf{e}}_2 + x_3\hat{\mathbf{e}}_3, \tag{1}$$

then its components in rectangular coordinates $(x_1, x_2, x_3)$ are
described by spherical coordinates as

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} r\sin\theta\cos\phi \\ r\sin\theta\sin\phi \\ r\cos\theta \end{pmatrix}. \tag{2}$$

Conversely, the spherical coordinates $r$, $\theta$, and $\phi$ can be written
as

$$r = (x_1^2 + x_2^2 + x_3^2)^{1/2}, \quad \theta = \cos^{-1}(x_3/r), \quad \phi = \tan^{-1}(x_2/x_1). \tag{3}$$

In the equatorial ($x_1$-$x_2$) plane, $\theta = 90°$, $\cos\theta = 0$, $\sin\theta = 1$, so
$x_1 = r\cos\phi$, $x_2 = r\sin\phi$, and $x_3 = 0$. This is the same as the polar
coordinate system described in Section A.3.1. Along the $x_3$ axis
we have $\theta = 0°$, so $x_1 = x_2 = 0$, and $x_3 = r$. Any of these expressions
written in terms of colatitude $\theta$ can be converted to
latitude $\lambda = 90° - \theta$, using $\cos\theta = \sin\lambda$ and $\sin\theta = \cos\lambda$.
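The conversions of Eqns 2 and 3 are easily coded. In the sketch below the function names are ours, angles are taken in degrees, and the two-argument arctangent `math.atan2` is used for the longitude because the simple ratio $x_2/x_1$ in Eqn 3 does not by itself identify the quadrant. The origin ($r = 0$), where the angles are undefined, is not handled.

```python
import math

def spherical_to_cartesian(r, theta_deg, phi_deg):
    """Eqn 2: colatitude theta and longitude phi, both in degrees."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return (r * math.sin(t) * math.cos(p),
            r * math.sin(t) * math.sin(p),
            r * math.cos(t))

def cartesian_to_spherical(x1, x2, x3):
    """Eqn 3, with atan2 used to pick the correct quadrant for phi."""
    r = math.sqrt(x1 * x1 + x2 * x2 + x3 * x3)
    theta = math.degrees(math.acos(x3 / r))
    phi = math.degrees(math.atan2(x2, x1))   # in the range (-180, 180]
    return r, theta, phi

# Point P of Fig. A.7-2 on a sphere of radius 6371 km.
x = spherical_to_cartesian(6371.0, 40.0, -60.0)
r, theta, phi = cartesian_to_spherical(*x)
```

Converting to Cartesian components and back recovers the original radius, colatitude, and longitude.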

This coordinate system is the familiar one (Fig. A.7-2) used
to locate points within the earth or on its surface, $r = a$. For this
purpose, the origin is placed at the center of the earth, and the
$x_3$ axis is defined by a line from the center of the earth through
the north pole. The intersections of planes containing the $x_3$
axis with the earth’s surface define meridians, lines of constant
longitude. The $x_1$ axis intersects the equator at the prime
meridian, on which $\phi$ is defined as zero, which has been chosen
to run through Greenwich, England. The intersections of planes
perpendicular to the $x_3$ axis with the earth’s surface define
parallels, lines of constant colatitude or latitude. Meridians are
a special case of great circles, lines on the surface defined by the
intersection of a plane through the origin with the surface of
the spherical earth. Parallels are a special case of small circles,
which are lines on the surface defined by the intersection of the
surface of the spherical earth with a plane normal to a radius
vector.

These conventions allow the colatitude $\theta$ ($0° \le \theta \le 180°$)
and longitude $\phi$ ($0° \le \phi < 360°$) to define a unique point on the
earth’s surface. Often locations are described in terms of
latitudes north and south of the equator, and longitudes east
and west of Greenwich. North and south latitudes correspond,
respectively, to colatitudes less than or greater than 90°.
Because $\phi$ measures longitude east of the prime meridian, west
longitudes correspond to values of $\phi$ less than 0° or greater than
180°. Thus a point at (10°S, 110°W) has $\theta = 90° + 10° = 100°$,
and $\phi = -110° = 360° - 110° = 250°$.
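This bookkeeping is a common source of errors in programs, so it is worth isolating in a small routine. The sketch below uses a function name and sign convention of our own choosing (north latitudes and east longitudes positive, all in degrees).

```python
def geographic_to_spherical(lat_deg, lon_deg):
    """Latitude (north positive) and longitude (east positive), in degrees,
    to colatitude theta and east longitude phi in the range [0, 360)."""
    theta = 90.0 - lat_deg
    phi = lon_deg % 360.0
    return theta, phi

# The point at (10 S, 110 W) discussed above.
theta, phi = geographic_to_spherical(-10.0, -110.0)
print(theta, phi)   # prints 100.0 250.0
```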

At any point, unit spherical basis vectors $(\hat{\mathbf{e}}_r, \hat{\mathbf{e}}_\theta, \hat{\mathbf{e}}_\phi)$ can be
defined in the direction of increasing $r$, $\theta$, and $\phi$. $\hat{\mathbf{e}}_r$ points away
from the origin, and gives the upward vertical direction. $\hat{\mathbf{e}}_\theta$
points south, and $\hat{\mathbf{e}}_\phi$ points east. These two are sometimes written
in terms of north- and east-pointing unit vectors, $\hat{\mathbf{e}}_{NS} = -\hat{\mathbf{e}}_\theta$
and $\hat{\mathbf{e}}_{EW} = \hat{\mathbf{e}}_\phi$.

An important feature of the unit spherical basis vectors is
that at different points they are oriented differently with respect
to the Cartesian axes. The Cartesian unit basis vectors
$(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ point in the same direction everywhere. By contrast,
for example, $\hat{\mathbf{e}}_r$ points in the $\hat{\mathbf{e}}_3$ direction at the north pole, and
in the $-\hat{\mathbf{e}}_3$ direction at the south pole. This effect is described by
the Cartesian $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ components of the unit spherical basis
vectors, at a point with colatitude $\theta$ and longitude $\phi$:

$$\hat{\mathbf{e}}_r = \begin{pmatrix} \sin\theta\cos\phi \\ \sin\theta\sin\phi \\ \cos\theta \end{pmatrix}, \quad \hat{\mathbf{e}}_\theta = \begin{pmatrix} \cos\theta\cos\phi \\ \cos\theta\sin\phi \\ -\sin\theta \end{pmatrix}, \quad \hat{\mathbf{e}}_\phi = \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix}. \tag{4}$$

The dependence on the colatitude and longitude describes how
the orientation with respect to the Cartesian axes changes.

At any point, the spherical basis vectors $(\hat{\mathbf{e}}_r, \hat{\mathbf{e}}_\theta, \hat{\mathbf{e}}_\phi)$ form an
orthonormal set. For problems whose spatial extent is small
enough that the curvature of the earth can be ignored, these
basis vectors provide a useful local coordinate system.
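The orthonormality of the spherical basis vectors, and their changing orientation with position, can be confirmed directly from their Cartesian components. In this sketch the function names are ours and the angles are in degrees.

```python
import math

def spherical_basis(theta_deg, phi_deg):
    """Cartesian components of (e_r, e_theta, e_phi) from Eqn 4."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    e_r = (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))
    e_theta = (math.cos(t) * math.cos(p), math.cos(t) * math.sin(p), -math.sin(t))
    e_phi = (-math.sin(p), math.cos(p), 0.0)
    return e_r, e_theta, e_phi

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# An arbitrary point: unit length and mutually perpendicular vectors.
e_r, e_theta, e_phi = spherical_basis(38.0, -10.25)
norms = (dot(e_r, e_r), dot(e_theta, e_theta), dot(e_phi, e_phi))
crosses = (dot(e_r, e_theta), dot(e_r, e_phi), dot(e_theta, e_phi))

# The radial vector points along +e_3 at the north pole, -e_3 at the south.
north_e_r = spherical_basis(0.0, 0.0)[0]
south_e_r = spherical_basis(180.0, 0.0)[0]
```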

A.7.2 Distance and azimuth

Spherical coordinates are especially useful in describing the
geographic relation between two points on the earth’s surface.
A common application is to find the distance between points
and the direction of the great circle arc joining them. A great
circle arc is the shortest path between points on a sphere, so if
seismic velocity varies only with depth, the fastest path along
the surface is the great circle arc, and the fastest paths through
the interior are in the plane of the great circle and the center
of the earth. Because velocities vary laterally by only a few
percent throughout most of the earth (and imperceptibly in
the liquid outer core), this is a good approximation for most
seismic applications. The source-to-receiver distance is often
given in terms of the angle $\Delta$ subtended at the center of the earth
by the great circle arc between the two points (Fig. A.7-3). If
$\Delta$ is expressed in radians, then the length $s$ (in km) of the arc
along the earth’s surface is $R\Delta$, where $R$ is the earth’s radius
($\approx$ 6371 km). If $\Delta$ is expressed in degrees, $s = R\Delta\pi/180$, so one
degree of arc equals 111.2 km.

Consider the great circle arc connecting an earthquake
whose epicenter is at $(\theta_E, \phi_E)$ and a seismic station at $(\theta_S, \phi_S)$.
Seismic waves that traveled along the great circle arc (or in the
plane of this arc and the center of the earth) left the earthquake
in a direction given by the azimuth angle $\zeta$, measured clockwise
from the local direction of north at the epicenter to the great
circle arc. These waves arrive at the seismometer from a direction
described by the back azimuth angle $\zeta'$, measured clockwise
from the local direction of north at the seismometer to the
great circle arc. To find these quantities, the Cartesian components
of the position vectors for the earthquake and the station
are written, using Eqn 2:

$$\mathbf{x}_E = \begin{pmatrix} R\sin\theta_E\cos\phi_E \\ R\sin\theta_E\sin\phi_E \\ R\cos\theta_E \end{pmatrix}, \quad \mathbf{x}_S = \begin{pmatrix} R\sin\theta_S\cos\phi_S \\ R\sin\theta_S\sin\phi_S \\ R\cos\theta_S \end{pmatrix}. \tag{5}$$

Fig. A.7-3 Geometry of the great circle path
between an earthquake epicenter and seismic
station (left), showing the convention for
defining the azimuth, $\zeta$ (right).

The distance $\Delta$, the angle between $\mathbf{x}_S$ and $\mathbf{x}_E$, is given by the
scalar product

$$\mathbf{x}_S\cdot\mathbf{x}_E = R^2\cos\Delta, \tag{6}$$

so

$$\Delta = \cos^{-1}[\cos\theta_E\cos\theta_S + \sin\theta_E\sin\theta_S\cos(\phi_S - \phi_E)]. \tag{7}$$

This formula defines $\Delta$ uniquely between 0 and 180°. This
shorter portion of the great circle is called the minor arc connecting
the two points; the longer portion, known as the major
arc, is $(360° - \Delta)$ degrees long.

To compute the azimuth from the earthquake to the station,
consider $\hat{\mathbf{b}}$, a unit vector normal to the great circle in the local
horizontal plane at $\mathbf{x}_E$, which is written using the vector product
of the position vectors

$$\mathbf{x}_S\times\mathbf{x}_E = \hat{\mathbf{b}}R^2\sin\Delta. \tag{8}$$

Evaluation of the vector product gives

$$\hat{\mathbf{b}} = \frac{1}{\sin\Delta}\begin{pmatrix} \sin\theta_S\cos\theta_E\sin\phi_S - \sin\theta_E\cos\theta_S\sin\phi_E \\ \cos\theta_S\sin\theta_E\cos\phi_E - \cos\theta_E\sin\theta_S\cos\phi_S \\ \sin\theta_S\sin\theta_E\sin(\phi_E - \phi_S) \end{pmatrix}. \tag{9}$$

The azimuth angle $\zeta$, measured clockwise from north, is then
given (Fig. A.7-3) by

$$\cos\zeta = \hat{\mathbf{b}}\cdot\hat{\mathbf{e}}_\phi = \frac{1}{\sin\Delta}(\cos\theta_S\sin\theta_E - \sin\theta_S\cos\theta_E\cos(\phi_S - \phi_E)) \tag{10}$$

and

$$\sin\zeta = \hat{\mathbf{b}}\cdot\hat{\mathbf{e}}_\theta = \frac{1}{\sin\Delta}\sin\theta_S\sin(\phi_S - \phi_E). \tag{11}$$

Use of both $\sin\zeta$ and $\cos\zeta$ makes the angle $\zeta$ unambiguous
($0° \le \zeta < 360°$). The azimuth from an earthquake to a receiver is
useful, because earthquakes radiate more energy in some directions
than in others (Chapter 4), so measurements at different
azimuths yield information about the source.

Fig. A.7-4 Geometry of the great circle path for an earthquake in the Peru
trench recorded at station VAL (Valentia, Ireland). The azimuth, $\zeta$, and
back azimuth, $\zeta'$, are not simply related, due to the sphericity of the earth.

The back azimuth $\zeta'$, obtained by reversing the indices E and
S in Eqns 10 and 11, shows the direction from which seismic
energy arrives at a seismometer. Seismometers typically record
the north-south and east-west components of horizontal
ground motion. Using the back azimuth, these observations
can be converted into radial (along the great circle path) and
transverse (perpendicular to the great circle path) components
by a vector transformation (Eqn A.5.9). This distinction is
made because waves appearing on these components propagated
differently (Section 2.4). The azimuth and back azimuth
angles are measured clockwise from north, a geographic
convention which contrasts with the mathematical one of
measuring angles counterclockwise from the $x_1$ direction.
Figure A.7-4 illustrates this geometry for an earthquake in
the Peru trench ($\theta_E = 102°$, $\phi_E = -78°$) recorded at station VAL
(Valentia, Ireland; $\theta_S = 38°$, $\phi_S = -10.25°$). The resulting distance
and azimuths are $\Delta = 86°$, $\zeta = 35°$, $\zeta' = 245°$.¹
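The distance and azimuth formulas (Eqns 7, 10, and 11) translate directly into a short routine. The sketch below is one possible implementation, with names of our own choosing and angles in degrees. It uses the two-argument arctangent so that the quadrant of $\zeta$ is resolved from the signs of $\sin\zeta$ and $\cos\zeta$; the common factor $1/\sin\Delta$ is positive and so can be omitted. The case of coincident or antipodal points, for which the azimuth is undefined, is not treated.

```python
import math

def distance_azimuth(theta_e, phi_e, theta_s, phi_s):
    """Distance (Eqn 7) and azimuth (Eqns 10, 11), all angles in degrees.

    (theta_e, phi_e): colatitude and longitude of the starting point.
    (theta_s, phi_s): colatitude and longitude of the end point.
    """
    te, pe = math.radians(theta_e), math.radians(phi_e)
    ts, ps = math.radians(theta_s), math.radians(phi_s)
    cos_delta = (math.cos(te) * math.cos(ts)
                 + math.sin(te) * math.sin(ts) * math.cos(ps - pe))
    cos_delta = max(-1.0, min(1.0, cos_delta))   # guard against roundoff
    delta = math.acos(cos_delta)
    # sin(delta)*cos(zeta) and sin(delta)*sin(zeta) from Eqns 10 and 11.
    c = (math.cos(ts) * math.sin(te)
         - math.sin(ts) * math.cos(te) * math.cos(ps - pe))
    s = math.sin(ts) * math.sin(ps - pe)
    zeta = math.degrees(math.atan2(s, c)) % 360.0
    return math.degrees(delta), zeta

EARTH_RADIUS_KM = 6371.0
peru = (102.0, -78.0)       # earthquake in the Peru trench
val = (38.0, -10.25)        # station VAL, Valentia, Ireland
delta, az = distance_azimuth(*peru, *val)
_, back_az = distance_azimuth(*val, *peru)   # indices E and S reversed
arc_km = EARTH_RADIUS_KM * math.radians(delta)
```

For the Peru to VAL path this gives values that round to the $\Delta = 86°$, $\zeta = 35°$, and $\zeta' = 245°$ quoted above.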

This analysis assumes that the earth is perfectly spherical. In
fact, the earth is flattened by its rotation into a shape close to an
oblate ellipsoid, so the radius varies with colatitude approximately
as

$$r(\theta) = R_e(1 - f\cos^2\theta), \tag{12}$$

where $R_e$ is the equatorial radius, 6378 km. The flattening
factor $f$ is approximately $3.35\times10^{-3}$, or about 1/298, so the
polar radius $R_p$ is 6357 km. An average radius can be defined
as the radius of a sphere with the same volume as the earth, if
it were a perfect ellipsoid. Because the volume of an ellipsoidal
earth would be $(4/3)\pi R_e^2 R_p$, and a sphere of radius $R$ has
volume $(4/3)\pi R^3$, the average radius is 6371 km. For certain
applications the ellipticity is included in precise distance
calculations.

¹ These distance-azimuth equations also have nonseismological applications
because ships and aircraft follow the shortest (great circle) paths between two points
when possible.


A.7.3 Choice of axes

Spherical coordinates are also used with axes different from
the geographic ones. Because the physics of a problem does not
depend on the choice of coordinates, a set of coordinates that
simplifies the relevant expressions is used. For example, in
earthquake source studies, the $x_3$ axis can be chosen to go from
the center of the earth to the location of the earthquake. The
prime meridian, and hence the $x_1$ axis, can be selected so that
the fault is oriented in the direction $\phi = 0$. These axes simplify
the description of the seismic waves radiated by an earthquake,
because the distance $\Delta$ from the source is now the colatitude.
Moreover, the radiation pattern generally has a high degree of
symmetry about the fault, so simple functions of $\phi$ appear. By
contrast, the radiation pattern need have no symmetry about
the North pole and Greenwich meridian, so a description in
those coordinates would usually be more complicated.

Fortunately, a coordinate system referred to the earthquake
location does not make describing the propagation of waves
from the source any more difficult. Because earth structure
varies primarily with depth, the spherical symmetry about
the center of the earth is independent of the axis orientation
chosen. The geographical convention in which the earth rotates
about the $x_3$ axis is helpful for navigation. In most seismological
applications, however, the north direction has no particular
significance because the propagation of seismic waves is essentially
unaffected by the earth’s rotation. The choice of a prime
meridian is arbitrary; in the early nineteenth century some
American maps had it through Washington DC, and some
French maps had it through Paris.

A.7.4 Vector operators in spherical coordinates

Because at a point on the sphere the unit spherical basis vectors
are oriented up, south, and east, the basis vectors at different
locations are generally not parallel. This makes the vector
differential operators more complicated, because these operators
involve taking spatial derivatives of vectors. In Cartesian
coordinates the unit basis vectors are not affected by this
differentiation because they do not change orientation, so only
derivatives of the components need be taken. In spherical
coordinates, because a vector $\mathbf{u}$ is

$$\mathbf{u} = u_r\hat{\mathbf{e}}_r + u_\theta\hat{\mathbf{e}}_\theta + u_\phi\hat{\mathbf{e}}_\phi, \tag{13}$$

differential operators acting on $\mathbf{u}$ must incorporate the derivatives
of the basis vectors. Thus, in spherical coordinates, for a
scalar field $\psi$ and a vector field $\mathbf{u}$:

$$\mathrm{grad}\,\psi = \hat{\mathbf{e}}_r\frac{\partial\psi}{\partial r} + \hat{\mathbf{e}}_\theta\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \hat{\mathbf{e}}_\phi\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\phi} \tag{14}$$

$$\mathrm{div}\,\mathbf{u} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2u_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,u_\theta) + \frac{1}{r\sin\theta}\frac{\partial u_\phi}{\partial\phi} \tag{15}$$
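$$\mathrm{curl}\,\mathbf{u} = \hat{\mathbf{e}}_r\frac{1}{r\sin\theta}\left[\frac{\partial}{\partial\theta}(\sin\theta\,u_\phi) - \frac{\partial u_\theta}{\partial\phi}\right] + \hat{\mathbf{e}}_\theta\frac{1}{r}\left[\frac{1}{\sin\theta}\frac{\partial u_r}{\partial\phi} - \frac{\partial}{\partial r}(ru_\phi)\right] + \hat{\mathbf{e}}_\phi\frac{1}{r}\left[\frac{\partial}{\partial r}(ru_\theta) - \frac{\partial u_r}{\partial\theta}\right] \tag{16}$$

$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}. \tag{17}$$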


These expressions are used when we discuss spherical waves in
Section 2.4 and the earth’s normal modes in Section 2.9.

A final point worth noting is that the elements of volume
and surface used in integrals are different in spherical coordinates
from rectangular coordinates. In spherical coordinates
(Fig. A.7-5) there are several scale factors, so an element of
surface area is

$$dS = r^2\sin\theta\,d\theta\,d\phi, \tag{18}$$

and an element of volume is

$$dV = r^2\sin\theta\,dr\,d\theta\,d\phi. \tag{19}$$
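A simple check on the volume element is that its integral over a sphere should give the familiar volume $(4/3)\pi R^3$. The sketch below does the integration numerically with the midpoint rule; the function name, the number of subdivisions, and the choice of quadrature are ours.

```python
import math

def sphere_volume(radius, n=400):
    """Midpoint-rule integral of dV = r^2 sin(theta) dr dtheta dphi."""
    dr = radius / n
    dtheta = math.pi / n
    # The integrand separates, so each coordinate is integrated on its own.
    radial = sum(((i + 0.5) * dr) ** 2 for i in range(n)) * dr
    polar = sum(math.sin((j + 0.5) * dtheta) for j in range(n)) * dtheta
    azimuthal = 2.0 * math.pi          # integrand does not depend on phi
    return radial * polar * azimuthal

R = 6371.0                             # km
numeric = sphere_volume(R)
exact = 4.0 / 3.0 * math.pi * R ** 3
```

Omitting the scale factor $r^2\sin\theta$, a common programming error, would give a result with the wrong value and the wrong dimensions.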



Fig. A.7-5 Definition of the element of volume in spherical coordinates.
Unlike the case of Cartesian coordinates, the volume element in spherical
coordinates is not a cube. (Marion, 1970. From Classical Dynamics of
Particles and Systems, 2nd edn, copyright 1970 by Academic Press,
reproduced by permission of the publisher.)


A.8 Scientific programming

Most seismological applications require computers, and these
requirements, especially in exploration applications with very
large data volumes, have spurred the development of computer
software and hardware. Some remarks about the use of computers
in seismology thus seem appropriate.

Computer usage in seismology includes several broad and
overlapping categories:

• Computers are often used in data acquisition and recording
systems.

• Data are initially displayed and manipulated using
computers.

• Subsequent analysis is frequently done using computers.
For example, seismograms can be filtered to enhance
certain frequencies or combined to better resolve certain
features.

• Theoretical, or synthetic, seismograms are often computed
for a range of the parameters under study and compared
to data to find the best fit.

• Computers are used to invert seismological data to determine
the parameters of a model which best matches the
data.

• Computer modeling is often used to draw geological inferences
from seismological observations. For example,
seismic velocity data are compared to the predictions of
models for the velocity of rock as a function of composition,
temperature, and pressure.

These applications often require scientific programming, a
programming style used for essentially mathematical applications.
Some problems in this book also require scientific programming.
Although programming is a matter of personal
style, this section discusses several points that may be helpful.
The suggested reading provides some starting points for readers
interested in pursuing these topics further.

A.8.1 Example: synthetic seismogram calculation

Consider a program to compute a synthetic seismogram for
waves in a one-dimensional constant-velocity medium, a mathematically
idealized string that illustrates features of wave
behavior. The program is based on $u(x, t)$, the displacement as
a function of position $x$ and time $t$. The displacement is zero at
the fixed ends of the string, $x = 0$ and $x = L$, between which
waves travel at speed $v$. As in Section 2.2.5, the displacement
can be written as the sum of the normal modes of the string,
each of which is a standing wave with $n$ half wavelengths along
the string,

$$u_n(x, t) = \sin(n\pi x/L)\cos(\omega_n t), \tag{1}$$

and vibrates at a characteristic frequency, or eigenfrequency,

$$\omega_n = n\pi v/L. \tag{2}$$



Fig. A.8-1 Top: Synthetic seismogram for a string showing the direct wave
arrival (1) and reflections (2, 3) from both ends. Bottom: Geometry
showing source and receiver positions, and the times of the direct and
reflected arrivals.


If a source at position $x_s$ generates a pulse at time zero with
duration $\tau$, the propagating waves are described by a weighted
sum of the modes

$$u(x, t) = \sum_{n=1}^{\infty}\sin(n\pi x_s/L)\sin(n\pi x/L)\cos(\omega_n t)\exp[-(\omega_n\tau)^2/4]. \tag{3}$$

Given the displacement $u(x, t)$ for any position and time,
a seismogram (“stringogram”) giving the displacement versus
time at a receiver position $x_r$ is $u(x_r, t)$. Alternatively, a
“snapshot” of the displacement everywhere on the string at
time $t_0$ is $u(x, t_0)$.

Consider a program to evaluate a synthetic seismogram
using this sum. For simplicity, we use a string of length 1 m¹
with a wave speed 1 m/s, a source at $x_s$ = 0.2 m and a receiver at
$x_r$ = 0.7 m. To approximate the infinite sum, the program adds
up 200 modes. The seismogram (Fig. A.8-1, top) is calculated
at 50 time steps, covering 1.25 s. This program is written in
Fortran, a language that is especially suitable for scientific programming
and is therefore commonly used in seismology (and
thus in this book). The program could also be written in other
languages, but the general points would still apply.

¹ It is easy to use arbitrary values on a computer; we could also use 1 km or 1
furlong. Finding a physical 1 km string is another matter ...

C SYNTHETIC SEISMOGRAM FOR HOMOGENEOUS STRING
C DISPLACEMENT U AS FUNCTION OF TIME T
C CALCULATED BY NORMAL MODE SUMMATION
      DIMENSION U(200)
      PI = 3.1415927
C
C PARAMETERS (NORMALLY WOULD COME FROM INPUT)
C STRING LENGTH (M)
      ALNGTH = 1.0
C VELOCITY (M/S)
      C = 1.0
C NUMBER OF MODES
      NMODE = 200
C SOURCE POSITION (M)
      XSRC = 0.2
C RECEIVER POSITION (M)
      XRCVR = 0.7
C SEISMOGRAM TIME DURATION (S)
      TDURAT = 1.25
C NUMBER TIME STEPS
      NTSTEP = 50
C TIME STEP (S)
      DT = TDURAT/NTSTEP
C SOURCE SHAPE TERM
      TAU = .02
C
C LIST PARAMETERS
      WRITE (6,3000)
 3000 FORMAT ('SYNTHETIC SEISMOGRAM FOR STRING')
      WRITE (6,3001) NMODE
 3001 FORMAT ('NUMBER OF MODES ', I6)
      WRITE (6,3002) ALNGTH, C
 3002 FORMAT ('LENGTH (M) ', F7.3, 'VELOCITY (M/S)',
     X        F7.3)
      WRITE (6,3003) XSRC, XRCVR
 3003 FORMAT ('POSITION (M): SOURCE', F7.3,
     X        'RECEIVER', F7.3)
      WRITE (6,3004) TDURAT, NTSTEP
 3004 FORMAT ('SEISMOGRAM DURATION (S) ', F7.3,
     X        I6, 'TIME STEPS')
      WRITE (6,3005) TAU
 3005 FORMAT ('SOURCE SHAPE TERM', F7.3)
C
C INITIALIZE DISPLACEMENT
      DO 5 I = 1, NTSTEP
         U(I) = 0.0
    5 CONTINUE
C
C OUTER LOOP OVER MODES
      DO 10 N = 1, NMODE
         ANPIAL = N*PI/ALNGTH
C SPACE TERMS: SOURCE AND RECEIVER
         SXS = SIN(ANPIAL*XSRC)
         SXR = SIN(ANPIAL*XRCVR)
C MODE FREQUENCY
         WN = N*PI*C/ALNGTH
C TIME INDEPENDENT TERMS
         DMP = (TAU*WN)**2
         SCALE = EXP(-DMP/4.)
         SPACE = SXS*SXR*SCALE
C
C INNER LOOP OVER TIME STEPS
         DO 15 J = 1, NTSTEP
            T = DT*(J - 1)
            CWT = COS(WN*T)
C COMPUTE DISPLACEMENT
            U(J) = U(J) + CWT*SPACE
   15    CONTINUE
   10 CONTINUE
C
C OUTPUT SEISMOGRAM FOR LATER PLOTTING
      WRITE (6,3101) (U(J), J = 1, NTSTEP)
 3101 FORMAT (7F10.4)
      STOP
      END
This example brings out several points: 

• Is the answer correct? Two different types of error occur
in scientific programs. First, the program may be wrong. In
this case, the mathematical formulation correctly describes the
physical problem, but the program incorrectly implements
this formulation. This is the usual situation, in which “bugs”
are identified and corrected. Second, the formulation may be
wrong, so the program correctly implements an incorrect
mathematical model. This could occur because of a mathematical
error, like an attempt to sum a divergent series, or a physical
error, such as an equation that does not correctly describe
waves on a string. An incorrect formulation is particularly disturbing
because it cannot be detected by checking the program.
For example, Fig. A.8-2 shows two computer simulations for
waves bending as they pass from one medium into another with
higher velocities. Figure A.8-2 (top) uses the correct formulation
of Snell’s law (Section 2.5), whereas Fig. A.8-2 (bottom)
looks equally convincing but is wrong because the equation
which the program illustrates is incorrect.

Fig. A.8-2 Demonstration of the danger that a program accurately
computes an incorrect mathematical formulation. Top: A correct
simulation of wave refraction using Snell’s law,
$\sin i_1/v_1 = \sin i_2/v_2$. Bottom: The same simulation using a
wrong formula for Snell’s law, $i_1/v_1 = i_2/v_2$.

Programmers check for both types of errors by choosing
cases for which the results can be predicted analytically and
comparing the results to those of the program. Several tests
are easily done for the string. The wave following the shortest
(direct) path appears at the expected time, 0.5 s (Fig. A.8-1,
bottom), because the source and the receiver are 0.5 m apart.
The next two arrivals, reflections from the ends of the string,
also occur at the expected times. Moreover, these arrivals have
polarities opposite that of the initial pulse, as should occur
(Section 2.2.3) upon reflection at the string’s fixed ends. The
program can also be checked for different string lengths,
speeds, and source and receiver positions. Similarly, in addition
to synthetic seismograms, displacements along the string
at fixed times could be computed. Such tests are important,
because if the mathematical model is not appropriate for the
physical situation, then time spent debugging, documenting,
and optimizing the program is wasted.
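
The direct-arrival test described above can be automated. The
following is a minimal sketch, not part of the original listing: it
repeats the mode sum with the same parameters, finds the time of the
largest positive displacement, and compares it with the predicted
travel time |XRCVR - XSRC|/C. The program name CHKARR, the variables
JMAX, TPRED, and TOBS, and the tolerance of one time step are
assumptions made for illustration.

```fortran
      PROGRAM CHKARR
!     Sketch of an automated check: the largest positive value of
!     the seismogram should occur near the predicted direct
!     arrival time ABS(XRCVR - XSRC)/C.
      IMPLICIT NONE
      INTEGER NMODE, NTSTEP, N, J, JMAX
      PARAMETER (NMODE = 200, NTSTEP = 50)
      REAL U(NTSTEP), PI, ALNGTH, C, XSRC, XRCVR, TDURAT, DT, TAU
      REAL WN, SPACE, TPRED, TOBS
      PI = 3.1415927
      ALNGTH = 1.0
      C = 1.0
      XSRC = 0.2
      XRCVR = 0.7
      TDURAT = 1.25
      TAU = 0.02
      DT = TDURAT/NTSTEP
      DO J = 1, NTSTEP
         U(J) = 0.0
      END DO
!     Same mode sum as in the listing above
      DO N = 1, NMODE
         WN = N*PI*C/ALNGTH
         SPACE = SIN(N*PI*XSRC/ALNGTH)*SIN(N*PI*XRCVR/ALNGTH)
         SPACE = SPACE*EXP(-((TAU*WN)**2)/4.0)
         DO J = 1, NTSTEP
            U(J) = U(J) + COS(WN*DT*(J - 1))*SPACE
         END DO
      END DO
!     Locate the largest positive displacement
      JMAX = 1
      DO J = 2, NTSTEP
         IF (U(J) .GT. U(JMAX)) JMAX = J
      END DO
      TPRED = ABS(XRCVR - XSRC)/C
      TOBS = DT*(JMAX - 1)
      WRITE (*,*) 'PREDICTED ARRIVAL (S):', TPRED
      WRITE (*,*) 'OBSERVED PEAK (S):    ', TOBS
      IF (ABS(TOBS - TPRED) .GT. DT) WRITE (*,*) 'CHECK FAILED'
      END
```

For the parameters used here the predicted time is 0.5 s. Changing
XSRC, XRCVR, C, or ALNGTH and rerunning gives further test cases of
the kind suggested in the text.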

• The program is reasonably comprehensible. Several features
help clarify the program. The program’s purpose and
method are stated. Variable names somewhat resemble those in
the equation: “SXS” is $\sin x_s$, and so on. Comments identify the
functions of portions of the program.

• The program uses loops and arrays. The seismogram is
described by the array U(J), and its values at successive times
are calculated by looping. Using an array, rather than discrete
variables UT1, UT2, etc., makes the program clearer and closer
to the mathematical formulation, and simplifies output. The
loop structure also makes the program clearer and allows
the number of time steps to be changed simply by changing the
parameter NTSTEP. Similarly, the number of modes is easily
changed.

• The output is labeled. The seismogram was placed in an
output file for later plotting. The parameters used to compute
the seismogram are included, so examination of the output
shows how it was computed. This helps avoid the common
situation where, given a large collection of computer output,
cases are rerun because it is unclear what parameters were
used. Moreover, subsequent “improved” versions of the program
can be checked to see whether they give the same results.

C     OUTER LOOP OVER MODES
      DO 10 N = 1, NMODE
         terms for each mode
         that do not depend on time
C        INNER LOOP OVER TIME STEPS
         DO 15 J = 1, NTSTEP
            terms that depend on time
C           COMPUTE DISPLACEMENT
   15    CONTINUE
   10 CONTINUE

Fig. A.8-3 Structure of the loops for the string synthetic seismogram
calculation.

• The program is somewhat efficient. Some thought is generally
put into optimizing scientific programs to make them run
rapidly. The program could find the displacement by looping
over time and summing all the modes at each time step. However,
consideration of the equation shows that three terms,
$\sin(n\pi x_s/L)$, $\sin(n\pi x/L)$, and $\exp[-(\omega_n \tau)^2/4]$,
are evaluated only once for each mode, whereas only
$\cos(\omega_n t)$ is evaluated for each time step. It is thus more
efficient to loop over the modes and evaluate each at all times
(Fig. A.8-3). Because the outer (mode) loop is executed 200 times,
whereas the inner (time) loop is executed 200 × 50 = 10,000 times,
the inner loop should be as efficient as possible. The program
would run more slowly if the loops were reversed. The difference,
though not significant for this calculation, might be significant
for much larger numbers of time steps and modes.

Further improvements could be made to fully optimize the
program. Optimization is not an end in itself, because the
programmer’s time and the intelligibility of the program are
also important. Programmers typically try to write reasonably
optimized programs without making them impossible to
understand and debug. Once fully tested, a program that will
be used heavily may be worth further optimization if the computer
time savings justify the effort required. There is no point
in “getting the wrong answer as fast as possible.”² Certain
computers, such as those using parallel processors, may require
specialized optimization.

A.8.2 Programming style

The style in which programs are written can make them easier
to develop, debug and use. A few suggestions, though not absolute
rules, may be useful.

2 Kernighan and Plauger (1978). 


• Document the program. Computer programs can be almost
useless without adequate documentation. Stonehenge has been
described as “the world’s largest undocumented computer
system.”³ Failure to document is often justified by the assumption
that the program will not be used again. This rationalization
is self-fulfilling, because even the author may find an
undocumented program difficult to reuse once the details are
forgotten.

Documentation should state the program’s goals and 
method. The input and output variables, their units, and how 
they are defined should be listed. Implicit assumptions and 
restrictions are worth noting. Comments should identify major 
portions of the program and describe their functions. 

Documentation is best written when writing a program 
because it can aid in debugging. Moreover, once a program is 
fully written, it is harder to remember how it works. Documentation
included in the program is less prone to be lost than that
written separately. 

Finally, documentation helps scientists exchange programs 
and work in collaboration. This can be useful, except in the 
apocryphal cases of programmers writing gigantic undocumented
programs to maximize their job security.

• Use modular programming. Large programs can generally
be divided into smaller subroutines or functions, which can
be used like the functions (e.g., sine, square root) supplied by
many computer languages. Each subroutine can be tested separately
and then used in various programs. Subroutines can
handle applications that frequently recur, such as reading or
plotting data or carrying out a mathematical operation. This
approach saves the time needed to write and debug portions of
a program similar to one already available. Moreover, the
overall structure of a program containing a set of calls to
subroutines is generally easier to understand, because many
complexities are isolated into subroutines.
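
As an illustration of this approach, the string calculation could be
recast with the mode summation isolated in a subroutine. The
following is only a sketch under stated assumptions: the names MODSYN
and STRSYN and the argument order are hypothetical choices made here,
not part of the original program.

```fortran
      PROGRAM MODSYN
!     Hypothetical driver: the mode sum is isolated in subroutine
!     STRSYN so that it can be tested alone and reused elsewhere.
      IMPLICIT NONE
      INTEGER NT
      PARAMETER (NT = 50)
      REAL U(NT)
      CALL STRSYN(1.0, 1.0, 200, 0.2, 0.7, 1.25/NT, 0.02, NT, U)
      WRITE (*,'(7F10.4)') U
      END

      SUBROUTINE STRSYN(ALNGTH, C, NMODE, XSRC, XRCVR, DT, TAU, NT, U)
!     Synthetic seismogram for a string by normal mode summation.
!     Input:  ALNGTH  string length (m)
!             C       velocity (m/s)
!             NMODE   number of modes
!             XSRC    source position (m)
!             XRCVR   receiver position (m)
!             DT      time step (s)
!             TAU     source shape term
!             NT      number of time steps
!     Output: U(NT)   displacement at times 0, DT, 2*DT, ...
      IMPLICIT NONE
      INTEGER NMODE, NT, N, J
      REAL ALNGTH, C, XSRC, XRCVR, DT, TAU, U(NT)
      REAL PI, WN, SPACE
      PARAMETER (PI = 3.1415927)
      DO J = 1, NT
         U(J) = 0.0
      END DO
      DO N = 1, NMODE
         WN = N*PI*C/ALNGTH
!        Terms that do not depend on time
         SPACE = SIN(N*PI*XSRC/ALNGTH)*SIN(N*PI*XRCVR/ALNGTH)
         SPACE = SPACE*EXP(-((TAU*WN)**2)/4.0)
         DO J = 1, NT
            U(J) = U(J) + COS(WN*DT*(J - 1))*SPACE
         END DO
      END DO
      RETURN
      END
```

Because the subroutine only reads its first eight arguments, passing
constants to it is harmless here. A separate test program could call
STRSYN with different lengths, speeds, and positions without
touching the driver.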

• Make programs comprehensible. It is helpful to be able to 
understand programs once written. Clear documentation and 
modular programming help. In addition, it should be easy to 
tell what portions will be executed under which circumstances. 
For this purpose, portions of a program should be executed 
sequentially, rather than jumping backwards and forwards 
within a program. 

Similarly, the statements themselves can be written clearly.
The use of mnemonic variable names and natural groupings of
variables can help. For example, it is somewhat unclear that

      X = 0.23873*A/(Y*Y*Y)

gives the average density X of a planet with mass A and radius
Y, whereas

      RHO = AMASS / ((4.0/3.0) * PI * (RADIUS**3))

is clearer. For clarity, the latter expression is more verbose
than required, requires that PI be previously defined, and is
slightly less efficient.

3 Brooks (1975).






• Don't be clever. Sometimes the shortest, “cleverest” way 
of programming something can be the worst. In addition to 
giving rise to lack of clarity, some shortcuts make it difficult 
to transfer programs between computers. This is especially true 
of programs that exploit specific properties of an individual 
computer or compiler, such as local variants of a standard 
programming language. 

• Keep a perspective on precision. The program calculates 
and manipulates numbers that, at least in theory, correspond to 
physical entities. It is worth keeping track of the precision
associated with the data and other quantities, and of that required
to compute the desired results. 

• Organize programs and data. Related programs and the 
associated files can be grouped into directories which include 
files listing and explaining the directory’s contents. Data files 
can be organized similarly. Often seismograms, for example, 
go through multiple processing stages carried out by different 
programs. A common practice is to use specific types of file 
names to indicate various intermediate stages. In addition, the 
data files begin with headers, information identifying the data
and recording the operations applied to it. The headers and file 
names should be updated by the programs themselves, rather 
than “by hand” at each stage. The output, whether text or 
graphic, should contain the parameters required to replicate 
the result. This can be especially important for interactive data 
processing because input files are not kept. 

A.8.3 Representation of numbers

Several simple concepts about numerical calculations on a 
computer are worth bearing in mind. One is the consequences 
of the way in which numbers are represented and manipulated. 
Because computers use binary (base 2) arithmetic, numbers 
are written as sets of bits, single binary digits, grouped into 
words. Some general ideas about these representations can be 
illustrated without going into the schemes used by various 
computers. 

Integers are represented by their binary equivalent. Thus 46
(decimal) is 101110, because

$46 = 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0$.

Many computers represent integers by 16- or 32-bit words.
The word length governs the range of possible integers. For
example, using 16 bits, one of which indicates the sign, the largest
positive integer that can be represented is

111 1111 1111 1111 (binary) = $2^{15} - 1$ = 32,767.
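
These properties can be examined directly on a given computer. The
sketch below assumes a Fortran 90 or later compiler, which provides
the binary edit descriptor B and the intrinsic functions BIT_SIZE and
HUGE; the program name INTREP is a hypothetical choice.

```fortran
      PROGRAM INTREP
!     Sketch: show the bit pattern of 46 and the largest default
!     integer.  B is the binary edit descriptor; BIT_SIZE and HUGE
!     are standard intrinsics (Fortran 90 and later).
      IMPLICIT NONE
      INTEGER I
      I = 46
      WRITE (*,'(A,B8.8)') '46 IN BINARY: ', I
      WRITE (*,*) 'BITS PER INTEGER:', BIT_SIZE(I)
      WRITE (*,*) 'LARGEST INTEGER: ', HUGE(I)
      END
```

The descriptor B8.8 pads the result to eight digits, so 46 appears as
00101110. The other two values depend on the word length that the
compiler uses for default integers.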


Fig. A.8-4 Representation of a floating point number using 32 bits: sign bit (S), exponent, and mantissa.


Because a greater range is needed for scientific computation,
floating point numbers are used:

$\text{number} = (\text{mantissa}) \times 2^{\text{exponent}}$.

Floating point numbers can accommodate fractions, with digits
to the right of the binary point representing negative powers
of two, just as digits to the left of the point represent positive
powers of two. For example,

46.625 (decimal) $= 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 + 1 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3}$
= 101110.101 (binary) $= 0.101110101 \times 2^6$.

To represent binary floating point numbers on a computer, a
certain number of bits are assigned to the mantissa and the
exponent. Figure A.8-4 shows one way in which a single precision
floating point number might be represented by a 32-bit
word. One bit is reserved for the sign of the mantissa, 8 bits
are used for the exponent including its sign, and the remaining
23 bits contain the mantissa. The number of bits available for
the exponent determines the range of the floating point numbers.
Because $2^8 = 256$, the exponent can represent numbers
between approximately $2^{127}$ and $2^{-128}$, or approximately
$10^{38}$ to $10^{-39}$. The number of bits in the mantissa determines
the precision or number of significant digits. Because $2^{-23}$ is
approximately $10^{-7}$, the maximum number of significant decimal
digits is about seven. Further precision can be obtained using
double precision numbers with additional bits for the mantissa.
The precise values of the range and the precision depend on details
of the implementation.
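
Because these values depend on the implementation, it can be useful
to ask the computer for them. The following sketch assumes a Fortran
90 or later compiler, whose intrinsic functions DIGITS, HUGE, TINY,
and EPSILON report the properties of a given numeric type; the
program name FLTREP is hypothetical.

```fortran
      PROGRAM FLTREP
!     Sketch: report the range and precision of single and double
!     precision numbers using standard inquiry intrinsics.
      IMPLICIT NONE
      REAL X
      DOUBLE PRECISION D
      X = 1.0
      D = 1.0D0
!     DIGITS: number of binary digits in the mantissa
!     HUGE, TINY: largest and smallest normalized positive values
!     EPSILON: spacing between 1.0 and the next larger number
      WRITE (*,*) 'SINGLE: MANTISSA BITS', DIGITS(X)
      WRITE (*,*) 'SINGLE: LARGEST      ', HUGE(X)
      WRITE (*,*) 'SINGLE: SMALLEST     ', TINY(X)
      WRITE (*,*) 'SINGLE: EPSILON      ', EPSILON(X)
      WRITE (*,*) 'DOUBLE: MANTISSA BITS', DIGITS(D)
      WRITE (*,*) 'DOUBLE: LARGEST      ', HUGE(D)
      WRITE (*,*) 'DOUBLE: SMALLEST     ', TINY(D)
      WRITE (*,*) 'DOUBLE: EPSILON      ', EPSILON(D)
      END
```

The values reported may differ somewhat from the illustrative scheme
of Fig. A.8-4, because the layout of the bits varies between
implementations.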

The range and precision in use are worth bearing in 
mind because computers do not always issue “overflow” or 
“underflow” warnings. The computer may assign a value, such 
as the largest floating point number, and proceed. It can be 
frustrating to find that the peculiar answers produced by a 
program result from numbers outside the computer’s range. 

A related malady is round-off error, the loss of computational
precision due to the limited number of significant digits.
To illustrate the concept, suppose that a computer used six bits
for the mantissa. The decimal addition

0.65625 + 0.96875 = 1.625

would, in binary, be

0.10101 + 0.11111 = 1.10100,

which, because no precision was lost, equals the exact answer.
Now, consider the decimal addition

5.25 + 0.96875 = 6.21875,

which, in binary, becomes

$0.101010 \times 2^3 + 0.111110 \times 2^0$.

To carry out the binary addition, because the numbers have
different exponents, the mantissa of the smaller number is
shifted to produce a common exponent. If some of the bits
representing the smaller number are lost, inaccuracy may result.
For example, in this case,

$0.101010 \times 2^3 + 0.000111 \times 2^3 = 0.110001 \times 2^3 = 6.125$ (decimal).

The precision available on a computer is generally adequate
to avoid significant round-off error. Nonetheless, it is a potential
problem to keep in mind, especially in long calculations or
in those such as a series sum where the answer is the difference
between large numbers.
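
The same loss of bits occurs in practice when a small number is added
to a much larger one. The sketch below, with hypothetical names,
assumes single precision numbers with about seven significant decimal
digits, as discussed above.

```fortran
      PROGRAM RNDOFF
!     Sketch: adding 1.0 to 1.0E8 in single precision.  With about
!     seven significant digits, the 1.0 is lost when the mantissa
!     of the smaller number is shifted to a common exponent.
      IMPLICIT NONE
      REAL BIG, SMALL, TOTAL, DIFF
      BIG = 1.0E8
      SMALL = 1.0
      TOTAL = BIG + SMALL
      DIFF = TOTAL - BIG
      WRITE (*,*) 'EXPECTED DIFFERENCE:', SMALL
      WRITE (*,*) 'COMPUTED DIFFERENCE:', DIFF
      END
```

On systems using such single precision numbers the computed
difference is typically zero rather than one. Declaring the variables
DOUBLE PRECISION recovers the expected result for these values.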

A.8.4 A few pitfalls 

Difficulties often can be avoided by considering how various 
statements in the program will be executed. This is especially 
the case when using compilers that provide little error checking 
and few helpful warning and error messages. The computer, 
following its explicit rules, may yield results differing from 
those expected. The foibles here are for Fortran, but similar 
ones often appear in other computer languages. 

• Statement execution. Problems often stem from the distinction
between integers and floating point numbers. For example,
if I and J are integer variables,

      J = 5
      I = 1/J

yields zero, because integer division yields an integer. This
problem is not cured by setting the result equal to a floating
point variable, or performing a floating point operation on the
integer result:

      X = 1/J
      Z = 1.0*(1/J)

yield zero, because division is done as an integer operation, and
the result (0) is converted to floating point (0.0). On the other
hand, most compilers give 0.2 as the result of

      X = 1.0/J,

although a conservative policy is to explicitly convert the integer
to floating point

      X = 1.0/FLOAT(J).
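
These cases can be collected in a short test program. The sketch
below uses hypothetical names and the generic intrinsic REAL, which
performs the same conversion as FLOAT.

```fortran
      PROGRAM INTDIV
!     Sketch: integer division and conversion to floating point.
      IMPLICIT NONE
      INTEGER I, J
      REAL X, Z, W
      J = 5
!     Integer division: the fractional part is discarded
      I = 1/J
!     The division is still done with integers, then converted
      X = 1/J
      Z = 1.0*(1/J)
!     Explicit conversion before dividing
      W = 1.0/REAL(J)
      WRITE (*,*) 'I = 1/J          GIVES', I
      WRITE (*,*) 'X = 1/J          GIVES', X
      WRITE (*,*) 'Z = 1.0*(1/J)    GIVES', Z
      WRITE (*,*) 'W = 1.0/REAL(J)  GIVES', W
      END
```

The first three results are zero, and only the last is 0.2 to within
single precision.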

A second class of problems can result from the order in
which operations are performed. For example, it may be
unclear whether

      -1.0**2

should be interpreted as $(-1.0)^2 = 1.0$ or $-(1.0)^2 = -1.0$.
Although the computer language rules are explicit, it may be
wise to use parentheses, e.g.,

      (-1.0)**2

to ensure that operations are carried out as desired. The
additional parentheses can also make the program more
comprehensible.
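
In Fortran, exponentiation is carried out before the leading minus
sign is applied, so the two forms give different results. A minimal
sketch with hypothetical names:

```fortran
      PROGRAM PRECED
!     Sketch: exponentiation binds more tightly than a leading
!     minus sign, so -1.0**2 is evaluated as -(1.0**2).
      IMPLICIT NONE
      REAL A, B
      A = -1.0**2
      B = (-1.0)**2
      WRITE (*,*) '-1.0**2   GIVES', A
      WRITE (*,*) '(-1.0)**2 GIVES', B
      END
```

The first result is -1.0 and the second is 1.0.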

• Subroutines. Subroutines are heavily used in writing scientific
programs. As a result, problems can arise when using
computer languages like Fortran in which what appear to be
arguments passed to a subroutine are actually the locations in
memory of these arguments.

A common error is exemplified by the following program

      CALL SUB(1.0)
      X = 1.0
      WRITE (6,*) ' X = ', X
      STOP
      END
      SUBROUTINE SUB(Y)
      Y = 5.0
      RETURN
      END

which, when executed, can yield “X = 5.0.” Because Y, a parameter
in the subroutine definition, was set equal to 5.0, the
value of the corresponding parameter in the subroutine call,
“1.0,” has been redefined as 5.0. This situation, which sometimes
underlies inexplicable behavior by programs, can be
avoided by not passing numerical values of an argument explicitly
to a subroutine if the argument will be redefined. For
example, had the first statements been

      Z = 1.0
      CALL SUB(Z)

the variable Z would equal 5.0, but “1.0” would not be
affected.
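
Where a Fortran 90 or later compiler is available, one way to guard
against this error is to place the subroutine in a module and declare
how each argument is used. The sketch below is an illustration under
that assumption, not the method of the original program; the module
name SAFESB and program name TSTSUB are hypothetical.

```fortran
      MODULE SAFESB
      IMPLICIT NONE
      CONTAINS
      SUBROUTINE SUB(Y)
!     Y is declared as an output, so the caller is expected to
!     supply a variable rather than a constant such as 1.0.
      REAL, INTENT(OUT) :: Y
      Y = 5.0
      END SUBROUTINE SUB
      END MODULE SAFESB

      PROGRAM TSTSUB
      USE SAFESB
      IMPLICIT NONE
      REAL Z
      Z = 1.0
      CALL SUB(Z)
      WRITE (*,*) 'Z = ', Z
!     With the interface known from the module, compilers can
!     typically reject the following calls at compile time:
!        CALL SUB(1.0)   a constant cannot be redefined
!        CALL SUB(1)     an integer is passed for a real
      END PROGRAM TSTSUB
```

Because the compiler sees the subroutine's definition through the
USE statement, it can also compare the type and number of arguments
in each call with those in the definition.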

Other errors occur when either the type or number of arguments
in a call to a subroutine do not match those in its definition.
For example, calling a subroutine with an integer variable
may yield unexpected results if the definition is in terms of a
real variable.

• Arrays. Scientific computing often involves dealing with
arrays, groups of data addressed by their indices. For example,
a seismogram giving a single component (e.g., vertical) of
ground motion can be written as an array (U(1), U(2) . . .) of
displacement versus time. Similarly, a seismogram giving all
three (vertical, north-south, east-west) ground motion components
can be written as a two-dimensional array

      U(1, 1), U(1, 2), U(1, 3), U(1, 4) . . .
      U(2, 1), U(2, 2), U(2, 3), U(2, 4) . . .
      U(3, 1), U(3, 2), U(3, 3), U(3, 4) . . .

whose first index gives the component, and second index indicates
the time.

Arrays are defined initially by statements giving their dimensions,
i.e.,

      DIMENSION A(N, M)

or

      REAL A(N, M).

Typically, the computer selects a memory location for the first
element in A and reserves N × M successive locations. Similarly,
N × M × R locations are reserved for a three-dimensional
array dimensioned (N, M, R). In Fortran, regardless of the
number of dimensions, an array is stored as one-dimensional
with the first index varying the most rapidly, then the second,
and so on. In other words, if A is dimensioned (2, 3), the storage
order is

      A(1,1), A(2,1), A(1,2), A(2,2), A(1,3), A(2,3).

For two-dimensional arrays, this can be thought of as storing
the array by columns. An individual array element is found by
calculating its location relative to that of the first element.
Thus, for an array dimensioned (N, M), with element (1, 1) at
location 1, element (I, J) is found at location

      1 + (I - 1) + (J - 1) × N.
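
The location formula can be checked numerically. The sketch below
assumes a Fortran 90 or later compiler: the intrinsic RESHAPE fills
the array in storage order, so each element holds the number of its
own storage location. The names COLMAJ and LOCN are hypothetical.

```fortran
      PROGRAM COLMAJ
!     Sketch: fill A(2,3) in storage order with 1..6, so that each
!     element holds its own storage location, then compare with
!     the location formula 1 + (I - 1) + (J - 1)*N.
      IMPLICIT NONE
      INTEGER N, M, I, J, LOCN
      PARAMETER (N = 2, M = 3)
      INTEGER A(N, M)
      A = RESHAPE((/ 1, 2, 3, 4, 5, 6 /), (/ N, M /))
      WRITE (*,*) 'I, J, FORMULA LOCATION, STORED VALUE'
      DO J = 1, M
         DO I = 1, N
            LOCN = 1 + (I - 1) + (J - 1)*N
            WRITE (*,*) I, J, LOCN, A(I, J)
         END DO
      END DO
      END
```

On each output line the last two numbers agree. For example, element
(1, 2) is at location 3 and element (2, 3) is at location 6.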

Several computational difficulties can arise in dealing with
arrays. A common set of errors involves being “off by one,”
either by starting or ending on the wrong element. This is
especially easy because some computer languages (e.g., Fortran)
start with the first element in an array being “1,” whereas
others (e.g., “C”) start with the first array element as “0.” Thus
one needs to make sure that the array elements correspond
to the expected variable values, such as seismic record times.
Often, when an array index is computed by the program, an
error yields an index outside the bounds dimensioned for the
array. Because many compilers do not check for such errors
unless specifically requested, a statement like

      A(9) = 4.0

will usually be executed even for an array dimensioned

      DIMENSION A(5).

Typically, the computer places 4.0 in whatever is 8 locations in
memory beyond A(1). This location may contain some other
variable, or a portion of the program itself. Often the program
continues until it requires the contents of the overwritten
location, at which point several things may occur. At best, the
program “crashes”; at worst, it continues the calculation with
erroneous values that propagate. Array element out-of-bounds
problems are among the most common and most frustrating
difficulties in scientific programming. When a compiler provides
array bounds checking, it is worth using.

The nature of array storage can also lead to inefficient programs.
On many computers, data which are actually on disk
can be treated as resident in memory, and are automatically
“swapped” into physical memory when needed. For efficiency,
large adjacent regions of the disk are often swapped into physical
memory together. Efficient programs minimize swapping
by making the most possible use of data that reside in physical
memory. By contrast, inefficient programs can produce
“thrashing,” a situation in which much of the computer’s time
is spent swapping rather than computing.

For example, consider⁴

      DIMENSION A(1000, 1000)
      DO 10 I = 1, 1000
      DO 10 J = 1, 1000
   10 A(I, J) = I + J

Because the elements of A are stored in column order, A(1, 1)
and A(1, 2) are a thousand locations apart. It would be more
efficient to reverse the loops

   10 A(J, I) = I + J

so that adjacent locations (A(1, 1), A(2, 1) . . .) were used
successively.

• Uninitialized variables. Problems frequently result from
uninitialized variables: those used in calculation without their
values being set. A common example, summing an array

      DO 10 I = 1, N
   10 SUM = SUM + A(I)

can give strange results unless the compiler initializes SUM
as zero. Because this is not always the case, it is wise to
explicitly initialize, e.g.,

      SUM = 0.0

before executing the loop. Proper initialization also helps to
ensure that programs do not give different results on different
computers.

• The computer may be wrong. Although most problems
result from programming errors, a very small fraction of the
time the error may be the computer’s. Compilers have been
known to contain “bugs” in common routines such as square
root, tangent, or complex arithmetic. This tempting explanation
for the failure of a long and intricate program can generally
be rejected unless a test program that carries out only the
suspect operation yields the wrong answer.

A.8.5 Some philosophical points

To close our discussion, a few general thoughts are worth
considering. Historically, computers were considered a scarce
and valuable resource. Currently, as computer power increases
and costs fall, it is increasingly practical to carry out
investigations numerically. One example is the change, both in
exploration and in global seismology, from earth models whose
properties vary only with depth, to three-dimensional models
that are evaluated numerically.

4 Hatton (1983c).

The role of analytic solutions is also changing. In addition 
to the traditional goal of providing exact solutions to simplified 
problems, analytic solutions provide test cases for numerical 
solutions of more complex problems. Analytic solutions can 
also yield the insight needed to evaluate numerical results. 

Along with the increase in the complexity of problems that 
can be solved computationally comes an increase in the volume 
of output. Fortunately, a parallel development has been the 
increasing role of graphic output, often in color. The proverb 
“A picture is worth a thousand words” may be unduly conservative
in this context. A thousand words on a computer
might be 32,000 bits; graphic output often makes it possible to 
visualize data with millions of bits. 

Finally, software such as spreadsheets or programs with
sophisticated general mathematical capabilities often eliminates
the need to write programs for a specific application. In this
book, we do not assume that such software will be used for the
problems, although many could be done this way. We think
that programming without using such software gives a deeper
understanding of the underlying principles. Hence, in educational
applications, we strongly favor programming, even if in
non-educational applications ease of use may favor sophisticated
software.

Further reading 

Many texts cover portions of the mathematical material summarized here.
Feynman (1982) discusses general issues of the relations between
mathematics and science. Butkov (1968) and Menke and Abbott (1990) provide
introductions to many of these topics. Fung (1969), Hay (1953), Jeffreys
and Jeffreys (1950), and Marion (1970) treat vectors, vector
transformations, and vector differential operators. Applied linear algebra
texts such as Franklin (1968) and Noble (1969) deal with the range of the
subject including numerical methods.

Articles by Hatton (1983a-d, 1984a,b, 1985) provide a broad and witty 
introduction to computer science for geophysicists. Eckhouse and Morris 
(1979) and Sloan (1980) cover topics in computer software, including the 
representation of numbers and arithmetic operations. Kernighan and 
Plauger (1976, 1978) discuss topics in programming style. Brooks (1975) 
treats issues in the development and organization of computer software. 
Numerical analysis texts like Froberg (1969) cover round-off and other 
sources of error in numerical computations. Harkrider (1988) gives an
entertaining anecdotal account of early (c.1960) computer usage in seismology.

The application of spherical geometry to the paths between an
earthquake and a receiver, including the effects of the earth’s
ellipticity, is discussed by Ben-Menahem and Singh (1981) and Bullen
and Bolt (1985). The theory of the earth’s shape is treated by Cook
(1973) and Jeffreys (1976).


Problems 


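1. Find the angle between the vectors (1, 4, 2) and (2, 3, 1).

2. Show, using index notation, that for the three-dimensional vectors
$\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$:

(a) $\mathbf{a} \times \mathbf{b}$ is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$.

(b) $|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}||\mathbf{b}| \sin\theta$, where $\theta$ is the angle between the two
vectors.

(c) $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$.

(d) $\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$.

(e) $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b})$.

(f) $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b})$.

3. Show that for arbitrary matrices A, B, and C:

(a) $(AB)^T = B^T A^T$.

(b) $(ABC)^T = C^T B^T A^T$.

4. Prove the following properties of determinants for the case of a
2 × 2 matrix:

(a) The determinant of a matrix equals the determinant of its
transpose.

(b) If two rows or columns of a matrix are interchanged, the
determinant has the same absolute value, but its sign changes.

(c) If a multiple of one row (or column) of a matrix is added to
another row (or column), the determinant is unchanged.

(d) If two rows or columns of a matrix are the same, the
determinant is zero.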

5. Express the determinant of a 3 x 3 matrix using the definition in 
Eqn A.4.17. 

6. Prove that if A has an inverse, the two solutions x and y satisfying 
Ax = b and Ay = b are equal. 


7. Find the inverse of the matrix 


both by the cofactor method and by row operations. Check that 
the solution is in fact the inverse. 

8. Show that the inverse of a 2 x 2 matrix A is given by 


9. Show that A, the transformation matrix for a rotation about the
$\mathbf{e}_3$ axis (Eqn A.5.9), satisfies $A^T A = I$ and is thus orthogonal.

10. Prove that the magnitude of a vector is preserved by an orthogonal
transformation.

11. Expand the determinant that gives the eigenvalues of a 3 × 3 matrix
(Eqn A.5.19) and verify that the invariants (Eqn A.5.21) are the
coefficients of the characteristic polynomial.

12. Prove the following vector identities using index notation:

(a) For any vector field $\mathbf{u}(\mathbf{x})$, $\nabla \cdot (\nabla \times \mathbf{u}) = 0$.

(b) For any scalar function $\phi(\mathbf{x})$, $\nabla \times \nabla\phi = 0$.

13. For the vector field $\mathbf{u}(x, y, z) = (3x^2y^2 + z,\ 2x^3y + 2y,\ x)$, find:

(a) $\nabla \cdot \mathbf{u}$.

(b) $\nabla \times \mathbf{u}$.

(c) $\nabla^2 \mathbf{u}$.

(d) A scalar field $\phi(x, y, z)$ such that $\mathbf{u} = \nabla\phi$.

14. Use index notation to show that the Laplacian in Cartesian
coordinates of any vector field $\mathbf{u}(\mathbf{x})$ satisfies

$\nabla^2 \mathbf{u} = \nabla(\nabla \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u})$.

15. Show that at any point in a spherical coordinate system, the
spherical basis vectors $(\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_\phi)$ form an orthonormal set.

16. Use Eqn A.7.6 to derive the angular distance $\Delta$ between the
locations of an earthquake and a seismic station as given in Eqn A.7.7.

Computer problems 

The solutions may be useful for other problems in this and other 
chapters. 

C-l. Find the largest integer your computer allows by starting with 
“2,” “2 x 2,” “2 x 2 x 2,” and doing successive multiplication by 
2. What happens when you exceed this number? Do the same for 
floating point numbers using “10.0” instead of “2” in both single 
and double precision. Does double precision allow larger floating 
point numbers? 

C-2. Find when your computer starts to show round-off error by
starting with “10.0” and doing successive multiplications by 10.0. At
each step, add 1.0 to the result and subtract the two numbers.
When does the difference become zero? Do the same in double
precision.

C-3. Write subroutines to do the following operations on an input
vector in three dimensions:

(a) Find the magnitude of a vector.

(b) Find the sum of two vectors.

(c) Find the scalar product of two vectors.

(d) Find the vector product of two vectors.

Your subroutines should include comment lines explaining the
purpose of the routine and the various inputs and outputs.

C-4. Write a subroutine using the necessary subroutines from problem
C-3 to find the angle between two vectors.

C-5. Use the solutions to problems C-3 and C-4 to find the magnitude,
sum, scalar product, and vector product of the vectors (1, 4, 2)
and (2, 3, 1), and the angle between the two vectors.

C-6. (a) Write a subroutine to multiply an n × m matrix by an
m-element vector.

(b) Write a subroutine to multiply an n × m matrix by an m × r
matrix.

(c) Write a subroutine to find the determinant of a 3 × 3 matrix.

C-7. (a) Write a subroutine that uses Gaussian elimination with
partial pivoting to solve the system of equations Ax = b. The
routine should take an arbitrary 3 × 3 matrix A and 3-element
vector b as inputs. The program should test the solution by
multiplying Ax and subtracting b from the result. The
subroutines from C-6 may be helpful.

(b) Use the subroutine to solve

C-8. (a) Write functions that return the values of the $\delta_{ij}$ and
$\varepsilon_{ijk}$ symbols given the indices as arguments. Test the
functions and show that they give the correct values.

(b) Write a program that uses these two functions to prove the
identity

$\varepsilon_{ijk}\varepsilon_{ist} = \delta_{js}\delta_{kt} - \delta_{jt}\delta_{ks}$

by testing all possible combinations of indices.


C-9. (a) Write a subroutine to invert a 3 x 3 matrix using elementary 

row operations. The subroutine should first check to see if 
the matrix is singular. It should test the result by multiplying 
by the original matrix. 

(b) Use this routine to invert 

\ -1 -1 ' 

3-1 2 . 

2 2 3 

V 7 

C-10. (a) Write a program to solve a 3 x 3 system of equations Ax = b 
using the matrix inversion routine from the previous prob¬ 
lem. The program should test the solution by multiplying Ax 
and subtracting b from the result. The subroutines from C-6 
may be helpful. 

(b) Use the program to solve the system of equations in C-7. 

C-11. (a) Write a subroutine to find the roots of a general cubic equation 
using the method given below. 1 

A cubic equation $y^3 + py^2 + qy + r = 0$ may be converted to 

$x^3 + ax + b = 0$ 

by defining 

$y = x - p/3$, $a = (3q - p^2)/3$, $b = (2p^3 - 9pq + 27r)/27$. 

If p, q, and r are real, the quantity 

$c = b^2/4 + a^3/27$ 

characterizes the roots: if c > 0, there is one real root and two 
conjugate imaginary roots; if c = 0, there are three real roots, 
of which two are equal; and if c < 0, there are three real and 
unequal roots. Using 

$A = (-b/2 + c^{1/2})^{1/3}$, $B = (-b/2 - c^{1/2})^{1/3}$, 

the values of x given by 

$x = A + B$, $[-(A + B) + (A - B)\sqrt{-3}]/2$, $[-(A + B) - (A - B)\sqrt{-3}]/2$ 

are the roots. 

The subroutine requires complex arithmetic and should 
test the roots by substituting back into the equation. 

(b) Use the result to solve 

$y^3 - 8y^2 + 19y - 12 = 0$. 
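
The method above can be sketched in Python roughly as follows. One detail is an assumption added here rather than part of the stated method: when A and B are complex, independently chosen principal cube roots need not pair correctly, so this sketch takes A as the principal cube root and sets B = -a/(3A), which enforces AB = -a/3. The function names and the tolerance are illustrative.

```python
import cmath

def cubic_roots(p, q, r):
    """Roots of y^3 + p y^2 + q y + r = 0 via the reduced cubic x^3 + a x + b = 0."""
    a = (3 * q - p ** 2) / 3
    b = (2 * p ** 3 - 9 * p * q + 27 * r) / 27
    c = b ** 2 / 4 + a ** 3 / 27
    sqrt_c = cmath.sqrt(c)                    # complex when c < 0
    big_a = (-b / 2 + sqrt_c) ** (1 / 3)      # principal complex cube root
    if abs(big_a) > 1e-12:                    # assumed tolerance
        big_b = -a / (3 * big_a)              # pairs the cube roots so that A*B = -a/3
    else:
        big_b = (-b / 2 - sqrt_c) ** (1 / 3)
    root_m3 = cmath.sqrt(-3)
    xs = [big_a + big_b,
          (-(big_a + big_b) + (big_a - big_b) * root_m3) / 2,
          (-(big_a + big_b) - (big_a - big_b) * root_m3) / 2]
    return [x - p / 3 for x in xs]            # undo the substitution y = x - p/3

def residual(y, p, q, r):
    """Substitute a root back into the cubic; should be close to zero."""
    return abs(y ** 3 + p * y ** 2 + q * y + r)

roots = cubic_roots(-8, 19, -12)
for y in roots:
    print(y, residual(y, -8, 19, -12))
```

For the cubic in part (b), c is negative, so the three roots are real and the imaginary parts returned are only rounding noise.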

C-12. (a) Write a subroutine to find the eigenvalues and eigenvectors 
of a real, symmetric 3 x 3 matrix, using the results of C-11. 
The program should check that the eigenvectors and eigenvalues 
satisfy their definition. Be careful to avoid dividing by 
zero. 

(b) Use this subroutine to find the eigenvalues and eigenvectors of 

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{pmatrix}.$$ 

1 Beyer (1984). 

C-13. (a) Write a program that accepts the latitude and longitude of 
two points on the earth’s surface and finds the angular distance 
and distance along the earth’s surface between them, 
and the azimuth and back azimuth. 

(b) Use your program to find the distances and azimuths 
between: 

(i) Cairo, Illinois (37°N, 89°W) and Cairo, Egypt (30°N, 


(ii) Berlin, New Hampshire (44.5°N, 71.5°W) and Berlin, 
Germany (52.5°N, 13.5°E). 

(iii) Montevideo, Minnesota (45°N, 95.5°W) and Montevideo, 
Uruguay (35°S, 56°W). 

(iv) Mexico, Maine (44.5°N, 70.5°W) and Mexico City, 
Mexico (19°N, 99°W). 
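
A short Python sketch of the calculation in C-13 is given below, using standard spherical trigonometry. It assumes a spherical earth of radius 6371 km, latitudes positive north and longitudes positive east, azimuths measured clockwise from north, and the back azimuth defined as the azimuth from the second point to the first. The function name `distance_azimuth` is hypothetical.

```python
import math

def distance_azimuth(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Angular distance (deg), surface distance (km), azimuth and back azimuth (deg)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    delta = math.acos(max(-1.0, min(1.0, cos_d)))   # clamp against rounding error
    # Azimuth at point 1 toward point 2.
    az = math.atan2(math.sin(dlon) * math.cos(p2),
                    math.cos(p1) * math.sin(p2)
                    - math.sin(p1) * math.cos(p2) * math.cos(dlon))
    # Back azimuth: azimuth at point 2 toward point 1.
    baz = math.atan2(-math.sin(dlon) * math.cos(p1),
                     math.cos(p2) * math.sin(p1)
                     - math.sin(p2) * math.cos(p1) * math.cos(dlon))
    return (math.degrees(delta), delta * radius_km,
            math.degrees(az) % 360.0, math.degrees(baz) % 360.0)

# Case (ii): Berlin, New Hampshire to Berlin, Germany (west longitude is negative).
delta_deg, dist_km, az_deg, baz_deg = distance_azimuth(44.5, -71.5, 52.5, 13.5)
print(round(delta_deg), round(dist_km), round(az_deg), round(baz_deg))
```

For case (ii) this gives an angular distance near 54° and an azimuth near 49°, in line with the values listed for C-13b in the solutions.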

Solutions to selected odd-numbered 
problems 


Note that in many problems (as in reality), the solution varies 
depending on the interpretation of the data or the assumptions 
used. 


Chapter 2 

(1) $R_{12} = 0$, $T_{12} = 1$. 

(3a) (3, -2, 5). 

(3b) (2, 1, 3). 

(3c) $(5, 3, 0)/\sqrt{14}$. 

(5a) $\sigma_1 = 2$, $\sigma_2 = 0$, $\sigma_3 = -2$; $\mathbf{n}^{(1)} = (1, 1, 0)/\sqrt{2}$, $\mathbf{n}^{(2)} = (0, 0, 1)$, 
$\mathbf{n}^{(3)} = (1, -1, 0)/\sqrt{2}$. 

(5b) $\tau = 2$. Planes have normals (1, 0, 0) and (0, 1, 0). 

(7b) -150 kbar. 

(7c) $D = \begin{pmatrix} 0 & -2 & 1 \\ -2 & -5 & 3 \\ 1 & 3 & 5 \end{pmatrix}$ 

(7d) 450 km. 

(9) 2%. 

(11a) $\begin{pmatrix} 6\lambda + 6\mu & 0 & 0 \\ 0 & 6\lambda + 2\mu & 2\mu \\ 0 & 2\mu & 6\lambda + 4\mu \end{pmatrix}$ 

(11b) $18\lambda + 16\mu$. 

(13a) $[(\lambda + 2\mu)/\rho]^{1/2}$. 

(13b) $(\mu/\rho)^{1/2}$. 

(15) $\sqrt{3}$. 

(17a) $\alpha = 11.25$ km/s; $\beta = 6.18$ km/s. 

(17b) $\alpha/\beta = 1.82$. 

(19a) 0.8 km, 8 km, 800 km. 

(19b) 0.000125 s, 0.125 s, 12.5 s; 8000 Hz, 8 Hz, 0.08 Hz. 

(21) $i_2 = 13°$; $i_3 = 17°$; $i_c = 37°$. 

(23a) For the $i_1 = 0°$ wave: $i_2 = 0°$, $l_1 = 2$ km, $l_2 = 2$ km, T = 3.3 s. For 
the $i_1 = 30°$ wave: $i_2 = 49°$, $l_1 = 2.3$ km, $l_2 = 3.0$ km, T = 4.3 s. 

For the $i_1 = 0°$ wave: $\mathbf{s}_1 = (0, 1)$ s/km, $|\mathbf{s}_1| = 1/v_1$, $\mathbf{s}_2 = (0, 2/3)$ s/km, 
$|\mathbf{s}_2| = 1/v_2$. For the $i_1 = 30°$ wave: $\mathbf{s}_1 = (0.5, \sqrt{3}/2)$ s/km, 
$|\mathbf{s}_1| = 1/v_1$, $\mathbf{s}_2 = (0.5, 0.44)$ s/km, $|\mathbf{s}_2| = 1/v_2$. 

(25a) 


(25b) 


(25c) 

(25d) 

(27a) 

(27b) 

(29) 

(33) 

(35a) 

(35b) 

(35c) 

(35d) 

(37a) 

(37b) 

(C-3) 


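$\psi_I = B_1 \exp[i(\omega t - k_x x + k_x r_\beta z)]$, 
$\psi_R = B_2 \exp[i(\omega t - k_x x - k_x r_\beta z)]$. 

$\phi_R = A_2 \exp[i(\omega t - k_x x - k_x r_\alpha z)]$. 

$\sigma_{xz} = \mu\left(2\dfrac{\partial^2\phi}{\partial z\,\partial x} + \dfrac{\partial^2\psi}{\partial x^2} - \dfrac{\partial^2\psi}{\partial z^2}\right)$, 

$\sigma_{zz} = \lambda\left(\dfrac{\partial^2\phi}{\partial x^2} + \dfrac{\partial^2\phi}{\partial z^2}\right) + 2\mu\left(\dfrac{\partial^2\phi}{\partial z^2} + \dfrac{\partial^2\psi}{\partial x\,\partial z}\right)$. 

$2r_\alpha A_2 + (1 - r_\beta^2)B_1 + (1 - r_\beta^2)B_2 = 0$, 

$A_2(\lambda + \lambda r_\alpha^2 + 2\mu r_\alpha^2) - 2\mu r_\beta B_1 + 2\mu r_\beta B_2 = 0$. 

$B_2/B_1 = -1$, $A_2/B_1 = 0$. 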

Ms* = 1M. Mpr = EM Ak = !lA. 

Ms; Ml’ Ms “Ml E s; Bj ’ E s , 

For ScS: $i_{slab} = 26°$, $i_{surf} = 4°$. For ScSp: $i_{slab} = 50°$, $i_{surf} = 20°$. 

50 km. 

$\omega_1 = 0.58$, $\omega_2 = 1.16$. 

$(3/2)(5\cos^2\theta\,\sin\theta - \sin\theta)$. 

2591 s. 

4.4 km/s. 

8.3 km/s, 13.8 km/s, 74.5 km/s. 

c = 5.36 km/s, 5.01 km/s, 4.05 km/s; $\lambda$ = 11,440 km, 1312 km, 
307 km. 


$\Delta\omega/\omega$ observed: 0.095. 

$\Delta\omega/\omega$ predicted: 0.038. 

For P waves at the CMB, $T_{mc} = 0.975$, $R_{mc} = -0.025$, 
$T_{cm} = 1.025$, $R_{cm} = 0.025$, $E_R/E_I = 0.0006$, $E_T/E_I = 0.9994$. 

For S waves at the CMB, $T_{mc} = 2$, $R_{mc} = 1$, $E_R/E_I = 1$, $E_T/E_I = 0$. 


Chapter 3 

(1) $\alpha_0$ = 5.7 km/s, $\alpha_1$ = 7.8 km/s, $h_0$ = 23 km. 

(3a) $\alpha_c$ = 6.7 km/s, $\alpha_m$ = 7.8 km/s. 

(3b) 3.1 km. 

(3c) 6.1 km. 

(5) $\alpha_c$ ~ 6.5 km/s, $\alpha_m$ = 8 km/s, dip = 4°, $h_u$ = 50 km, $h_d$ = 30 km. 

(11) 24,000,000. 


(23b) 
9.34 km/s. 

11.24 km/s. 

For D = 0 km: $p_{40}$ = 8.3 s/degree, $p_{60}$ = 6.9 s/degree, $i_{40}$ = 26°, 
$i_{60}$ = 21°. For D = 600 km: $p_{40}$ = 7.9 s/degree, $p_{60}$ = 6.6 s/degree, 
$i_{40}$ = 52°, $i_{60}$ = 41°. 

4.5 s/degree. 

13.4 km/s. 

SKKS. 

~ 58-3 ^r, ^Wo^o) = 3.0 hr, ^ 1/e ( 0 ^ 3 o) = 4-2 hr. 

54,300 km for ${}_0T_{30}$, 76,450 km for ${}_0S_{30}$. 


M = 1.94 x $10^{24}$ kg, $\rho$ = 11 g/cm$^3$. 


There are two solutions: 


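Solution 1: 

$$\begin{pmatrix} -2.14 & 0 & 0 \\ 0 & 2.01 & 0 \\ 0 & 0 & 0.13 \end{pmatrix} = \begin{pmatrix} -1.135 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1.135 \end{pmatrix} + \begin{pmatrix} -1.005 & 0 & 0 \\ 0 & 2.01 & 0 \\ 0 & 0 & -1.005 \end{pmatrix}$$ 

(Double-couple scalar moment)/(original scalar moment) = 0.546. 

(CLVD scalar moment)/(original scalar moment) = 0.838. 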
Solution 2: 


Chapter 4 


Earthquake a: $(\phi, \delta, \lambda)_1$ = (310°, 65°, 90°) (thrust); $(\phi, \delta, \lambda)_2$ = 
(130°, 25°, 90°) (thrust); P axis (azimuth, plunge) = (40°, 20°); 
T axis = (220°, 70°); B axis = (130°, 0°). 

Earthquake b: $(\phi, \delta, \lambda)_1$ = (176°, 80°, 195°) (right-lateral 
strike-slip); $(\phi, \delta, \lambda)_2$ = (83°, 75°, 350°) (left-lateral strike-slip); 
P axis (azimuth, plunge) = (40°, 18°); T axis = (309°, 3°); 
B axis = (209°, 72°). 

Earthquake c: $(\phi, \delta, \lambda)_1$ = (9°, 90°, 180°) (right-lateral 
strike-slip); $(\phi, \delta, \lambda)_2$ = (99°, 90°, 0°) (left-lateral strike-slip); 
P axis (azimuth, plunge) = (234°, 0°); T axis = (144°, 0°); 
B axis = (undefined, 90°). 

Earthquake d: First solution: $(\phi, \delta, \lambda)_1$ = (16°, 85°, 90°) (dip 
slip); $(\phi, \delta, \lambda)_2$ = (196°, 5°, 90°) (thrust); P axis (azimuth, plunge) 
= (106°, 40°); T axis = (286°, 50°); B axis = (196°, 0°). Second 
solution: $(\phi, \delta, \lambda)_1$ = (78°, 66°, 25°) (left-lateral strike-slip); 
$(\phi, \delta, \lambda)_2$ = (337°, 67°, 154°) (right-lateral strike-slip); 
P axis (azimuth, plunge) = (28°, 1°); T axis = (297°, 34°); 
B axis = (119°, 56°). 


0 0 0 
0 0 0 
0 0 AL 


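$$\begin{pmatrix} -2.14 & 0 & 0 \\ 0 & 2.01 & 0 \\ 0 & 0 & 0.13 \end{pmatrix} = \begin{pmatrix} -2.075 & 0 & 0 \\ 0 & 2.075 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \begin{pmatrix} -0.065 & 0 & 0 \\ 0 & -0.065 & 0 \\ 0 & 0 & 0.13 \end{pmatrix}$$ 


$$\begin{pmatrix} -2.14 & 0 & 0 \\ 0 & 2.01 & 0 \\ 0 & 0 & 0.13 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0.94 & 0 \\ 0 & 0 & -0.94 \end{pmatrix} + \begin{pmatrix} -2.14 & 0 & 0 \\ 0 & 1.07 & 0 \\ 0 & 0 & 1.07 \end{pmatrix}.$$ 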

(Double-couple scalar moment)/(original scalar moment) = 
0.452. 

(CLVD scalar moment)/(original scalar moment) = 0.892. 

4.24 mm/yr, 0.85 mm/yr, 0.42 mm/yr. 

$M_s$ = 5.2. 

assuming $\mu$ = 3 x $10^{11}$; 200,000 km; 43,333 km. 

~0.04 Hz. 

0.003-0.03. 

Japan: 8 mo. (M > 6); 7 yr (M > 7); 65 yr (M > 8). S. California: 
1 yr (M > 6); 8 yr (M > 7); ≈ 100 yr (M > 8). New Madrid: 92 yr 
(M > 6); 920 yr (M > 7); 9200 yr (M > 8). 

Earthquake a: n = (0.453, -0.785, 0.423); 
d = (0.098, 0.515, 0.852). 

Earthquake b: n = (0.853, -0.150, 0.500); 
d = (0.492, -0.087, -0.866). 

Earthquake c: n = (0.853, -0.150, 0.500); 
d = (-0.492, 0.087, 0.866). 

Earthquake d: n = (-0.633, -0.754, 0.174); 
d = (0.758, -0.559, 0.337). 

Earthquake e: n = (-0.633, -0.754, 0.173); 
d = (-0.758, 0.559, -0.337). 


(Double-couple scalar moment)/(original scalar moment) = 0.999. 
(CLVD scalar moment)/(original scalar moment) = 0.054. 

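Earthquake a: 

$$\begin{pmatrix} 0.088 & 0.157 & 0.427 \\ 0.157 & -0.808 & -0.451 \\ 0.427 & -0.451 & 0.720 \end{pmatrix}$$ 

Earthquake b: 

$$\begin{pmatrix} 0.840 & -0.148 & -0.492 \\ -0.148 & 0.026 & 0.087 \\ -0.492 & 0.087 & -0.866 \end{pmatrix}$$ 

Earthquake c: 

$$\begin{pmatrix} -0.840 & 0.148 & 0.492 \\ 0.148 & -0.026 & -0.087 \\ 0.492 & -0.087 & 0.866 \end{pmatrix}$$ 

Earthquake d: 

$$\begin{pmatrix} -0.960 & -0.218 & -0.082 \\ -0.218 & 0.843 & -0.351 \\ -0.082 & -0.351 & 0.117 \end{pmatrix}$$ 

Earthquake e: 

$$\begin{pmatrix} 0.960 & 0.218 & 0.082 \\ 0.218 & -0.843 & 0.351 \\ 0.082 & 0.351 & -0.117 \end{pmatrix}$$ 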
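
The moment tensors listed here can be checked against the fault normal n and slip vector d given for each earthquake. The sketch below assumes the standard double-couple relation $M_{ij} = n_i d_j + n_j d_i$ for unit scalar moment; that relation is not stated in this section, and the function name `moment_tensor` is illustrative.

```python
def moment_tensor(n, d):
    """Unit-moment double-couple tensor M_ij = n_i d_j + n_j d_i (assumed relation)."""
    return [[n[i] * d[j] + n[j] * d[i] for j in range(3)] for i in range(3)]

# Earthquake b, using the n and d values listed in the solutions.
n_b = (0.853, -0.150, 0.500)
d_b = (0.492, -0.087, -0.866)
m_b = moment_tensor(n_b, d_b)
for row in m_b:
    print([round(v, 3) for v in row])
```

Because n and d are quoted to three decimals, the computed components agree with the tabulated ones only to about the last digit. Reversing the sign of d, as for earthquakes c and e, reverses the sign of every component.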


(C-7b) 0.95. 

(C-9a) Gaussian: 0.1%; Poisson: 4%. 

(C-9b) Gaussian: 0.3%; Poisson: 3%. 

(C-9c) Gaussian: 0.5%; Poisson: 2%. 

(C-11a) $\tau$ = 21.8 yr; $\sigma$ = 7.2 yr; Poisson: p = 37%; 
Gaussian: C(1993, 1985) = 64%. 

(C-11b) $\tau$ = 21.8 yr; $\sigma$ = 1.5 yr; Poisson: p = 37%; 
Gaussian: C(1993, 1985) = 99%. 

(C-11c) $\tau$ = 25.5 yr; $\sigma$ = 11.1 yr; Poisson: p = 31%; 
Gaussian: C(2018, 2010) = 82%. 

(C-11d) $\tau$ = 27.2 yr; $\sigma$ = 14.7 yr; Poisson: p = 29%; 
Gaussian: C(2028, 2020) = 74%. 


Chapter 5 

(1a) 0.77 m; $M_w$ = 6.8, length = 31 km. 

(1b) 4.62 m; $M_w$ = 7.8, length = 240 km. 

(3a) 40 mm/yr. 

(3b) 125, 250, 500 yr. 

(3c) For 25%: 500, 1000, 2000 yr; for 50%: 250, 500, 1000 yr. 

(3e) $M_w$ = 8.4; $M_0$ = 5 x $10^{28}$ dyn-cm. 

(5) 6 x $10^{31}$ dyn-cm; $M_w$ = 10.5. 

(7a) 47 mW/m$^2$. 

(7b) 33 mW/m$^2$. 

(7c) 84 mW/m$^2$. 

(9) ~1 Ga. 

(11) -21.5 bar/°C. 

(13) $vL^3/(24\kappa^2 t)$; 28 (for v = 10 cm/yr and t = 150 Ma). 

(17) 58°; 251 MPa. 

(C-1b) San Andreas: 46 mm/yr at 324°; Aleutian: 53 mm/yr at 346°. 

(C-3b) $(\theta, \phi, |\omega|)$ = (-63.0°, 107.4°, 0.641 °/My). 

(C-3c) Hawaii: 66 mm/yr at 299°. 


Chapter 6 

(1a) $a_0 = 0$, $a_k = 0$, $b_k = (2/k\pi)(1 - \cos(k\pi))$. 

(1b) $a_0 = 0$, $a_k = 0$, $b_k = -\cos(k\pi)/k\pi = (-1)^{k+1}/k\pi$. 

(3a) -1. 

(3b) 4i. 

(3c) -i. 

(3d) 1.5 + 2.6i. 

(7a) $[\delta(\omega - \omega_0) - \delta(\omega + \omega_0)]$. 

(9a) $a^2\sigma_u^2 + b^2\sigma_v^2 + 2ab\,\sigma_{uv}^2$. 

(9b) $a^2v^2\sigma_u^2 + a^2u^2\sigma_v^2 + 2a^2uv\,\sigma_{uv}^2$. 

(9c) $(a^2/v^2)\sigma_u^2 + (a^2u^2/v^4)\sigma_v^2 - 2(a^2u/v^3)\sigma_{uv}^2$. 

(9d) $a^2b^2u^{2b-2}\sigma_u^2$. 

(11a) $v\Delta t/(2\cos i)$. 

(11b) 10 km. 

(11c) $(\sigma_v^2\Delta t^2 + \sigma_{\Delta t}^2 v^2 + \sigma_i^2 v^2\Delta t^2\tan^2 i)/(4\cos^2 i)$. 

(11d) 4 km. 


Appendix 


(1) 21°. 

(5) $a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}$. 

(7) $A^{-1} = \begin{pmatrix} -2/3 & 1/3 \\ 5/6 & -1/6 \end{pmatrix}$ 

(13a) $6xy^2 + 2x^3 + 2$. 

(13b) (0, 0, 0). 

(13c) $(6y^2 + 6x^2, 12xy, 0)$. 

(13d) + $x^3y^2 + y^2$ + constant. 

(15) Hint: use Eqn A.7.4. 

(C-5) $|(1, 4, 2)| = \sqrt{21}$; $|(2, 3, 1)| = \sqrt{14}$; sum = (3, 7, 3); $\mathbf{a} \cdot \mathbf{b} = 16$; 
$\mathbf{a} \times \mathbf{b} = (-2, 3, -5)$; $\theta$ = 21.1°. 

(C-7b) (0, -1, 1). 

(C-9b) $\begin{pmatrix} 0.7 & -0.1 & 0.3 \\ 0.5 & -0.5 & 0.5 \\ -0.8 & 0.4 & -0.2 \end{pmatrix}$ 

(C-11b) 1, 3, 4. 

(C-13b) i: $\Delta$ = 93°, $\zeta$ = 48°. 
ii: $\Delta$ = 54°, $\zeta$ = 49°. 
iii: $\Delta$ = 87°, $\zeta$ = 148°. 
iv: $\Delta$ = 35°, $\zeta$ = 232°. 

absorption peak, 196 

acceleration (ground motion), 14-17,21-2 
accelerometers, 14,404-5 
accuracy, of estimates, 6-7,391-2 
acoustic impedance, 33, 77 
activation energy, 356 
activation volume, 356 
Adams-Williamson equation, 200-2 
adiabatic gradient (adiabat), 201,310 
aftershocks, 217,277 
air gun, as seismic source, 148-9 
Airy isostasy, 301n3 
Alaska earthquake (1946), 19,26 
Alaska earthquake (1964), 19,20,238-9,261-3 
source parameters, 265-6 
stress drop, 270 
aleatory uncertainty, 7 
Aleutian arc, 311 
aliasing, 386-7,405 
amplitude modulation, 94n1 
amplitude spectrum, 95,102, 373 
amplitude tomography, 433 
amplitude 

from ray densities, 100-1,160,169 
reflected and transmitted waves, 32-4,76, 85 
Andes, deformation rates, 341 
anelasticity, 185-6,190-4 
physical dispersion, 194-6 
physical models, 196-7 
angle of incidence 
for plane waves, 65-6 
in spherical earth, 157 
angle of internal friction, 351 
angle of sliding friction, 352 
angular frequency, 31 
angular order, 103 

animal behavior, earthquake precursor, 25 
anisotropy 

asthenosphere, 180-2 
azimuthal, 179 
composite structures, 180 
of core, 182-5 
definition, 177 
of lithosphere, 180-2 
mantle, 182-4 

of minerals and rocks, 179-80 
olivine crystals, 60,179 
transverse isotropy, 178-9 
velocities for, 178-9 


antipodal focusing, 169,170 
Appalachian Mountains, 182, 334 
apparent dip, 126,153 
apparent reflector, 153-4 
apparent velocity 
filtering, 145-6 
seismic waves, 65-6 
surface waves, 87-91 

Armenia earthquake (1988) see Spitak (Armenia) 
earthquake 

arrays, seismometer networks, 407 
aseismic deformation, 342 
aseismic slip, 262, 307,323-4, 340 
asperities, 273 

associated Legendre function, 103,106 
asthenosphere, 170,286 
anisotropy, 180-2 
viscous flow, 331,365 

Atlantic Ocean, intraplate earthquakes, 326-7 
atmosphere, evolution of, 288 
attenuation 
in crust, 197-8 
geometric spreading, 56,187 
inverse problem for normal modes, 434-7 
in mantle, 198 
physical dispersion, 194-6 
quality factor Q, 114,190-3,197-8 
seismic waves, 114,185-98,229-30 
spectral resonance peaks, 193-4 
attenuation operator, 196 
auto-correlation, 151,384-5 
auxiliary fault plane, 219 
axial high, mid-ocean ridge, 299, 305 
axial valley, mid-ocean ridge, 299,305 
azimuth, spherical coordinates, 463-5 
azimuthal anisotropy, 179 
azimuthal order of normal modes, 103 

b-values, 274-7 

back azimuth, 456,464-5 

backarc basin, 307 

Balleny Islands earthquake (1998), 13, 328,347 
bandpass filter, 378-9, 383 
bar, pressure unit, 41 
basalt, 132 

Basin and Range, 130,293,334 
Bayes’s theorem, 279 
BCIS see Bureau Central International de 
Seismologie 
benchmarks, 251n1 
Benioff, H., 288 

Big Bear earthquake (1992), 253-4 


Birch, F., 119,201 

blind zone, refraction seismology, 123 
block slider model, 360 
body force, 39 

equivalent for earthquakes, 220,239,245 
body waves 
core phases, 166-9 
definition, 3 
lower mantle, 171-4 
magnitude, 264 
modeling, 231-5 
phases, 163-6 
radiation patterns, 220-2 
travel time studies, 162-76 
and upper mantle structure, 169-71 
visualizing, 174-6 
see also P waves; S waves 
Borah Peak earthquake, Idaho (1983), 293, 347 
borehole seismometer, 400,408,433 
boundary conditions 
core-mantle, 105n4 
different interfaces, 51-2 
reflection and transmission, 76, 79 
string waves, 33 
surface waves, 87,90 
bowtie structure, 153-4 
boxcar function, 231,381,383 
breathing mode, 106 
bridges, earthquake damage, 18 
Brillouin scattering, 179 
brittle fracture, 349, 352 
brittle-ductile transition, 357 
broadband seismometers, 403-4 
Browning, I., 11n3 
buildings 

as damped harmonic oscillators, 194 
earthquake risks, 14-18 
bulk modulus, 50 
bulk sound speed, 200 
Bullen, K., 162n1, 201 

Bureau Central International de Seismologie (BCIS), 
398 

Byerlee’s law, 353 
Calaveras fault, 276 

California Strong-Motion Instrumentation 
Program, 410 

Carrizo Plain, San Andreas fault, 215,260 
Cartesian coordinate system, 445,455 
caustic, 160,169,188 
CDP see common depth point 
cell hit-count plot, 311 
Central Indian Ocean ridge, 295, 329 
central limit theorem, 392 
centroid moment tensor project (CMT) see 
Harvard centroid moment tensor project 
Chagos-Laccadive ridge, 329n4 
Chang Heng, 400-1 
chaos theory, 24n11 
characteristic polynomial, 457 
Chile earthquake (1960), 11,110, 265, 266, 321, 
323 

Chilean subduction zone, 323 

China, earthquake prediction, 25-6 

circular fault, 269 

Clapeyron slope, 315-16 

CLVD see compensated linear vector dipole 

CMB see core-mantle boundary 

CMP see common midpoint 

CMT see Harvard centroid moment tensor project 

Cocos plate, 294-5 

coda, 189 

coefficient of internal friction, 351 
coefficient of sliding friction, 352 
coefficient of thermal expansion, 201 
cohesive strength, 351 
collision, between plates, 336-9 
common depth point (CDP) stack, 144 
common midpoint (CMP) stacking, 141-4, 

395-7 

common source point (CSP), seismograms, 141 
compensated linear vector dipole (CLVD), 245-6, 
250 

complex Fourier series, 371-2 
complex numbers, properties, 443-4 
composite structures, anisotropy, 180 
Comprehensive Test Ban Treaty (CTBT), 27 
compression, 40 

compressional (P) axis, 221, 226-7, 243-4 
compressional quadrant, 5,219 
compressional waves see P waves 
compressive stress, 40,351 
computers 

scientific programming, 466-74 
solving linear equations, 453-4 
Conrad discontinuity, 130 
constitutive equations, 38, 48-51 
continental drift, 9,286,295-6 
continental earthquakes, 333-48 
continental lithosphere, 287 
continental plates 
deformation, 339-43 
plate boundary zones, 334-9 
rifting, 333-4,343, 345 
continuum mechanics, 38 
convolution 
digital, 390-1 

in earthquake modeling, 229-34 
linear systems, 378, 379-80 
in reflection seismology, 150-1 
coordinate transformations, vectors, 455-6 
coordinates 

Cartesian coordinate system, 445,455 
polar, 443 
spherical, 462-6 
core of earth 
anelasticity, 198 
anisotropy, 182-5 
body wave phases, 166-9 
chemical composition of, 205,208-10 
inner core boundary (ICB), 162,209-10 
regions of, 162 


core-mantle boundary (CMB), 105n4,162, 
174 

density changes, 202 
temperature, 204,209 
Coriolis force, 114 
corner frequencies, 267,270 
coseismic displacement, 217,254-6 
Coulomb-Mohr failure criterion, 351,362-3 
coupling 

of earth’s modes, 114-15 
of P-SV waves, 64 
see also seismic coupling 
covariance, 394,421 
cracks, fluid-filled, 181 
critical angle, waves at interface, 67-8, 78, 
121 

critical distance, refracted waves, 121 
creep, 262 

cross product, 447-8 

cross-correlation, seismograms, 151, 383-5 
crossover distance, refraction seismology, 121 
crust 

anisotropy, 180-2 
attenuation, 197-8 

boundary with mantle (Moho), 122,130 
geological composition, 130-4 
refraction studies, 128-31 
CSP see common source point 
curl, vector fields, 460-1 

D" region 

composition of, 207-8 
structure of, 171-4 
temperature, 204 
damped harmonic oscillator 
model for anelasticity, 190-4 
model for seismometer, 398-9 
damped least squares solution, 430 
damping factor, 190, 398 
dams, earthquake damage, 18 
data see seismic data 
data space inversion, 436-7,439 
decomposition 
matrix (Lanczos), 427,429 
moment tensor, 246,250-1 
vector field (Helmholtz), 54n3 
deconvolution 

earthquake source, 235,380 
linear systems, 80 
in reflection seismology, 148-51 
deformation 

coseismic, 254-9 
interseismic, 259-63 
measuring, 251-4 
permanent or transient, 342-3 
postseismic, 365 
regional, 364-5 
rheology, 349-50 
seismic or aseismic, 339-42 
theoretical models, 349-66 
degeneracy of normal modes, 104 
delay time, 181,232 
delta functions 

application to deconvolution, 150-1, 380 
Dirac, 375-7 
Kronecker, 449 

density, within the earth, 199-202 
deep earthquakes 
definition, 308 

relation to subduction, 310-21 


depth of earthquakes 
classification, 308 
determining, 6-7,232-4,238 
and lithospheric properties, 303, 357-62 
at ridges, 305 
at continental rifts, 334 
at subduction zones, 310,312,318 
depth of ocean, 301-3 
deviatoric stresses, 45-6 
DFT see Discrete Fourier Transform 
diagonalized stress tensor, 43 
differential interferogram, 253 
diffraction, 2, 72-5,153 
and core phases, 167-8 
diffraction hyperbola, 153 
diffraction sum migration, 153 
digital convolution, 390-1 
digital seismographs, 251, 404 
dilatation 
volume change, 48 
seismic first motion, 219 
dilatational quadrant, 5,219 
dip angle, 218 
dip filters, 147 

dip-slip faulting, 218,225-6,236,244,256, 
269-70 

Dirac comb, 385-6 

Dirac, Paul, 443n1 

Dirac delta function, 375-7 

direct wave, refraction seismology, 120 

direction cosines, 455 

directivity, 231 

Discrete Fourier Transform (DFT), 387-91 

dislocation, 265-6 

dispersion 

dispersive signals, 94-6 
geometrical, 96-9 
normal mode, 107-10 
physical, 96,194-6 
surface waves, 87,96-100,433 
tsunamis, 99-101 
dispersion relations, 90,107 
displacement 
potentials for, 54,63 
string wave, 30 
seismic wave, 53-7,63-5 
static (coseismic), 254-6 
distance, spherical coordinates, 464 
divergence, vector field, 459-60 
divergence theorem, 460 
Dix equation, 136 
Doppler effect, 231 
dot product, 446-7 
double-couple source, 220,240,242 
downward continuation, 155 
ductile flow, 355-7 
ductile materials, 349 
dunite, in mantle, 205 
dynamic friction, 359 
dynamic range of seismometers, 400 

earth 

anelastic structure, 197-8,437 

anisotropic structure, 177-85 

density, 199-202 

interfaces within, 75 

models, 62-3,119,162,202-3,434-9 

normal modes, 101-15 

pressure in, 202 

surface boundary conditions, 51-2 
study of, 1,2-3,119-20 
temperature in, 202-4 
earthquakes 

acceleration from, 14-17, 21-2 

continental, 333-48 

damage caused by, 11-20 

deaths due to, 12-13 

deep, 308,310,312-21 

depth determination, 6-7,232-4,238 

distribution of, 9-10, 289 

epicenter, 4,416 

energy radiated, 10,11,273 

and faults, 215-16 

first motions, 219-20 

forecasting, 20-4,278-81 

frequency-magnitude relations, 274-7 

geodesy, 251-63 

hazard estimates, 14-15,21-2,346,379 
hypocenter, 4,217,251,416 
insurance, 14 

intensity of shaking, 14-17, 346 
intermediate depth, 308,320-1 
intraplate earthquakes, 11,271,288,303, 
326-48 

continental, 343-8 
list of notable, 12-13 
locating, 416-24 
location bulletins, 251, 398 
magnitude, 4,11, 263-6 
numbers of, 11,274-7 

mid-ocean ridge and transform, 298-9,305-7 

oceanic, 326-32 

plate boundaries, 288 

prediction of, 9, 11n3, 24-6 

probabilities, 21-4,278-81 

possible unpredictability of, 23-6,274,280 

real-time warnings, 26 

and regional deformation, 364-5 

risks, 11-14 

and rock friction, 359-64 
shallow, 308 
silent, 262 
slow, 271 
statistics, 274-81 
and strength of lithosphere, 357-9 
at subduction zones, 307-10 
in subducting slabs, 312-21 
swarms, 326,328 
trench, 321-5 
yearly energy release, 275 
East African rift, 333-5,343 
East Pacific rise, 299,305, 307 
Easter microplate, 307 
Eastern North America 
earthquakes, 14-17,343-7 
seismic attenuation, 17,197-8 
seismic hazards, 14-17,22,346-7 
eclogite, 133, 321 
effective stress, 353 
effective viscosity, 356 
efficiency, seismic, 273 

eigenfrequencies, 36-8,101-2,107-10,434-5, 
466 

eigenfunctions, 36,92,101-2,107 
eigenvalues, 426-9,456-8 
eigenvectors, 426-9,456-8 
Einstein, Albert, 279n5 
Einstein summation convention, 449 
elastic rheology, 48-51, 349 
elastic lithosphere, 304 


elastic moduli, 49-50,177-8 
elastic rebound, 21,215, 259-62 
elastic strain energy, 52,61-2 
elastic-perfectly plastic rheology, 349 
electromagnetic seismometers, 401-2 
endothermic phase transition, 206,316, 318 
energy 

flux in P-SV waves, 80-1 
flux in SH waves, 77-8 
in harmonic waves, 35-6 
in plane waves, 61-2 
radiated in earthquakes, 10,11,273 
strain energy, 52 
engineering seismology, 14-18 
epistemic uncertainty, 7 
epicenter, of earthquakes, 4,416 
epicentral distance, 163 
equal-area projection, 223n3 
equation of equilibrium, 47, 314, 330 
equation of motion, 38,46-7 
equivalent body forces, 220,239-45 
error ellipse 

earthquake location, 422,424 
Euler poles, 344,440 
errors 

earthquake location, 7,420-2 
propagation of, 393-4 
random, 7,392-5 
systematic, 7, 392 
Euler pole, 290-1 
Euler vectors, 290-5, 326 
inverse problem for, 439-40 
for NUVEL-1A model, 294 
Euler’s theorem, 290n1 
evanescent wave, 68, 78 
excitation amplitudes, 102, 111 
excitation functions, 236 
exothermic phase transition, 206,315-16, 318 
exploding reflectors experiment, 152 
exploration seismology, 3-5,134-57 
explosions, as seismic sources, 245 

far-field motion, 259-60 
failure line, 351 

Fast Fourier Transform (FFT), 389-90 
“fault strength” paradox, 363 
faults 

analytical representation of geometry, 228-9 
blind faults, 256 

body wave radiation pattern, 220-2,232-3 
dip-slip, 218,225-6,236,244,256,269-70 
and earthquakes, 4-5, 215-17 
geometry of, 217-19 
heat flow, 363 

normal, 45,218,225-7,236,244,298-9, 328, 
334,336 

reverse, 45,218,225-7,236,244 
rupture propagation, 230-1,238-9 
seismic cycle, 217,259-63 
shear stresses, 40-5,350-3 
slip, 218,230-1,242,254-62 
static displacements, 254-6 
stereographic projection, 223-8 
stick-slip, 359-64 
stress direction, 44-5,226-7,345 
strike-slip, 45,218,225-7,236,244,254-5, 
269,298-9,328 

surface wave radiation pattern, 235-6 
thrust, 45,218,225-7,236,244, 328,336 
transform, 286,298, 305-7 


FDSN see Federation of Digital Broad-Band 
Seismographic Networks 
Federation of Digital Broad-Band Seismographic 
Networks (FDSN), 398,408 
Fermat’s principle, 70-2, 74,122,188-90 
Fernandina caldera, Galapagos Islands, 

276 

Feynman, Richard, 9n1 

FFT see Fast Fourier Transform 

filtering 

anti-aliasing, 405 
bandpass, 378-9, 383 
signals, 369 
tau-p, 147 

velocity (dip), 145-7 
finite impulse response (FIR) filter, 405 
finite signal length, 380-3 
FIR see finite impulse response 
fire, caused by earthquakes, 18-19 
first motions, earthquakes, 219-22,239 
flatness matrix, 430 
focal hemisphere, 222 
focal mechanism, 219-29,235-9 
deep earthquakes, 312-14 
intermediate earthquakes, 312-14 
ridge-transform earthquakes, 298-9 
focus, of earthquake see hypocenter 
foot wall block, 218 
football mode, 106 
force couple, 241-2 
force-feedback seismometer, 403 
forecasting, earthquakes, 20-4,278-81 
foreshocks, 25 

Fort Tejon earthquake (1857), 22,279, 334 
Fortran, for scientific programming, 467 
forward problems, 6,415 

410-km discontinuity, 163-4,170-1,202,205-6, 
315-16,395 
Fourier analysis, 369-70 
linear systems, 377-85 
Fourier series, 370-2 
Fourier transform, 94-5,229,372-5 
delta functions, 375-7 
Discrete Fourier Transform (DFT), 387-91 
double Fourier transform, 146 
Fast Fourier Transform (FFT), 389-90 
finite length signals, 380-3 
inverse Discrete Fourier Transform (IDFT), 
388-9 

properties of, 374-5 
spatial, 145-6 
fractal scaling, 274 
fractional crystallization, 209n11 
fracture, of rocks, 349-54 
fracture strength, 349 
fracture zone, 298 
free oscillations, 36,101 
frequency domain, 373 
frequency response, seismometer, 402 
frequency-magnitude relations, earthquakes, 
274-7 

frequency-time domain equivalence, 229,235, 
373-4 

Fresnel zone, 166,188 
friction, and earthquakes, 359-64 
friction and fracture, in rocks, 350-4 
fundamental modes 
Love wave, 91-2,96 
spheroidal, 106 
torsional, 105 
gabbro, 130-4, 321 

gain, seismometer, 402 

Gaussian distribution, 279-80,392-5 

Gaussian elimination with partial pivoting, 454 

Gaussian pill box, 51-2 

Gauss’s theorem, 460 

GDSN see Global Digital Seismic Network 

geodesy 

combined with seismological data, 256-9 
coseismic deformation, 254-6 
interseismic deformation, 259-63 
space-based methods, 251-4 
plate motions from, 295-6 
geoid, 303 

geological composition, crust and upper mantle, 
130-4 

geology, effect on earthquake damage, 18 
geometric ray theory, 70-2 
geometric spreading, 56,160,187 
geometrical dispersion, 96-9 
geophones, 141 
GEOSCOPE network, 407 
geotherm, 202-4 

oceanic lithosphere, 302 
and rock strength, 357-8 
subducting slab, 309 

Gilbert Islands, earthquake swarm (1981-3), 328 
glacial loads, removal, 346 
Global Digital Seismic Network (GDSN), 407 
global plate motions, 8, 293-5 
Global Positioning System (GPS), 251-2,296, 323, 
336-41,344-5,365 
Global Seismic Network (GSN), 407-8 
Gloria transform fault, 326 
GPS see Global Positioning System 
gradient, vector field, 459,465 
Grand Banks earthquake, Newfoundland (1929), 
241,346-7 
granite, 133,359 
gravimeter, 403 
gravity within earth, 199-202 
grazing incidence, 78, 80 
great circles, 463-4 
Green’s function, 235,246-7,380 
Greenwich meridian, 463 
ground motion from earthquakes 
acceleration, 14-17,21-2 
intensity, 14-17,246 
ground roll, 141 

groundwater, earthquake precursor, 25 
group velocity, 94-7 
Guatemala earthquake (1976), 235 
Gulf of Aden, 333,334 
Guralp-3T seismometer, 404 
Gutenberg, Beno, 274n2 
Gutenberg-Richter relation, 274-7 

Haicheng (1975) earthquake, 12,25 
halfspace model, oceanic lithosphere, 302 
half-spreading rate, 300 
hanging wall block, 218 
harmonic oscillation, damped, 190-4, 398-9 
harmonic waves 
definition, 31-2 
energy in, 35-6 
plane wave, 55 
harmonics, spherical, 103-4 
vector spherical, 105-6 

Harvard centroid moment tensor project (CMT), 
251,266 


Hawaii 

intraplate earthquakes, 327-8 
tsunamis, 19,26 

Hawaiian-Emperor seamount chain, 297 
hazards, definition, 11 
head wave 

amplitude, 128 
dipping layers, 123-6 
flat layers, 121-3 
heat engine model of earth, 287 
heat flow 
oceanic, 301-3 
on faults, 363 

Hebgen Lake earthquake, Montana (1959), 

347 

Hellenic trench, 339,342,431 
Helmholtz decomposition, 54n3 
Herglotz-Wiechert integral, 161-2 
highways, earthquake damage, 18 
Hilbert transform, 166n3 
Himalayas, 336 

homogeneous equation of motion, 47,53-4 
homogeneous medium, earthquake location, 
419-20 

Hooke’s law, 49,177 

horizontal slowness, 69,137 

hot spots, 297,327,347-8 

Huygens’ principle, 72-5,122,153,189 

hydrophones, 141 

hydrostatic pressure, 354 

hypocenter, of earthquakes, 4,217,251, 416 

IASP91 earth model, 162-4 
ICB see inner core boundary 
IDA see International Deployment of 
Accelerometers 
identity matrix, 451 

IDFT see inverse Discrete Fourier Transform 
ilmenite, 205 

imaginary numbers, 443-4 

impedance, 33,77, 83 

impulse response, 377 

IMS see International Monitoring System 

incompressibility, 50 

Incorporated Research Institutions for Seismology 
(IRIS), 398,403 

Global Seismographic Network (GSN) program, 
407-8 

index notation, 38n1, 448-9 
India-Eurasia plate collision, 336 
Indian Ocean, earthquakes, 328-9 
infinitesimal strain theory, 49 
inhomogeneous wave, 78 
inhomogeneous wave equation, 56 
inner core boundary (ICB), 162,209-10 
InSAR see Synthetic Aperture Radar interferometry 
intensity of shaking, 14-17, 346 
intercept-slowness (tau-p) 
filtering, 147-8 

formulation for travel time, 137-40 
interfaces 

boundary conditions, 51-2 
in the earth, 75 

SH reflection and transmission at, 76-8 
P-SV reflection and transmission, 81-6 
Snell’s law, 66-8 
interferometry, 252-4 
intermediate depth earthquakes 
definition, 308 

relation to subduction, 310-21 


Intermountain Seismic Belt, 347 
internal friction, 186 

International Deployment of Accelerometers (IDA), 
403-4 

International Monitoring System (IMS), 28,408 
International Seismological Centre (ISC), 
earthquake bulletins, 251,398 
International Seismological Summary (ISS), 398, 
407 

interplate earthquakes, 288 
mid-ocean ridges, 298-9, 305-7 
trench, 321-5 

interseismic motion, 217,259-63 
intraplate earthquakes, 288 
continental, 343-8 
oceanic, 326-32 

intraplate earthquakes, 11,271,288,303,326-48 

intraplate stress field, 331, 345 

intrinsic attenuation, seismic waves, 185,190-8 

inverse Discrete Fourier Transform (IDFT), 388-9 

inverse filters, 150-1,380 

inverse Fourier transform, 95,372 

inverse problems 

earthquake location, 416-24 
migration as, 153 
plate motions, 439-41 
solving, 6,415-16 
stratified earth structure, 434-9 
surface wave dispersion, 96-9 
travel time tomography, 424-34 
inverse theory, 415-19 
Iran earthquake (1990), 11 
IRIS see Incorporated Research Institutions for 
Seismology 
isoseismals, 15 
isostasy, 301 
isotherms, 300 
isotropy, 50,177 

ISS see International Seismological Summary 
Izmit earthquake, Turkey (1999), 13,339, 363 

Jackson, David, 9n7 

Jamaica earthquake (1692), 20 

Japan 

earthquake prediction program, 9 
regional networks, 410 
seismicity, 322n5 
Jeffreys, Harold, 9,162n1 
Jeffreys-Bullen (JB) earth model, 162 
in travel time tomography, 430-2 
joint hypocenter determination, 424 
Juan de Fuca plate, 291,293,295 

Kalapana earthquake, Hawaii (1975), 327-8 

Kansu earthquake, China (1920), 20 

Kepler, Johannes, 110n7 

kernels, 434-7 

Kirchhoff migration, 153-5 

Klauder wavelet, 151 

Kobe earthquake, Japan (1995), 9,13,18 

Kronecker delta, 449 

Kuhn, Thomas, 9 

Kuril subduction zone, 323 

Lg phase, 197 

Labrador Sea, 327 

Lame constants, 50 

Lanczos decomposition, 427,429 

Landers earthquake (1992), 12,253-4,282, 

293 




landslides 

caused by earthquakes, 20 
as seismic sources, 241 
Laplacian, vector field, 461-2,466 
Large Aperture Seismic Array (LASA), 409 
lateral spreading, 20 

lattice-preferred orientation (LPO) anisotropy, 

177 

layered medium 
dipping, 123-6 
as earth model, 62-3 
plane waves in, 62-86 
refraction seismology, 120-3 
least-squares solution, 418 
left-lateral slip, 218 
Legendre polynomials, 103 
Lehmann, Inge, 168n6 
Lesser Antilles, travel time tomography, 432 
lid see seismic lithosphere 

light, analogies for seismic waves, 2, 32n3,38n6, 
56-7,61n1, 67,70n2,74,96,185,189n3, 
194n5,231n2 
linear elasticity, 48 
linear superposition, 34, 377 
linear systems, 377-85 

convolution and deconvolution, 379-80 
linear vector space, 450 
linear velocity, plate motions, 290 
liquefaction, caused by earthquakes, 20 
lithosphere, 170,286-7 
anisotropy, 180-2 
strength of, 357-9 
see also oceanic lithosphere 
lithostatic stress, 45-6 

Loma Prieta earthquake (1989), 7,12-13,15,24, 
282,293 
aftershocks, 111 
damage caused by, 13,18 
ground motion, 62 
liquefaction, 20 
source parameters, 265-7 
Long Beach earthquake (1933), 17 
Long Valley caldera, 246, 334 
longitudinal waves, 57 
Love, A. E. H., 86n1, 110
Love waves 

definition, 86-7 
dispersion, 91-3,96-7 
focal mechanisms, 235-9 
layer over halfspace, 90-3,102 
and torsional modes, 107,109 
low-velocity zone (LVZ), 99,170,204,303,358, 
435 

lower focal hemisphere, 222 
lower mantle see mantle 

LPO see lattice-preferred orientation anisotropy 
lunar seismology, 210-11 
LVZ see low-velocity zone 

magma chamber, 186,246,305 
magnesiowüstite, 205, 316
magnetic reversals, 293 
magnitude, earthquake 
body wave, 264 
of earthquakes, 4,11,263-6 
frequency-magnitude relations, 274-7 
local, 263 
moment, 266 
and radiated energy, 273 
saturation, 265,268 


surface wave, 264 
uncertainties in, 7,266 
magnitude of vectors, 446 
mainshock, 277 
mantle 

anisotropy, 182-4 
attenuation, 198,435-7 
boundary with crust (Moho), 122,128-31 
chemical composition of, 204-7 
convection system, 286-7 
discontinuities, 170-1 
lower mantle structure, 171-4 
regions of, 162 
temperature of, 204 
upper mantle structure, 169-71 
viscosity of, 331-2,350,355-7 
see also core-mantle boundary; D" region 
mantle plume hypothesis, 297 
mantle waves, 251 
Mars, 211,287-8 

master event methods, earthquake location, 424 
mathematical techniques, 443 
matrix 

adjoint, 451 
cofactor, 452 

computer solutions of linear equations, 453-4 
diagonalization and decomposition, 426-7, 456-7

definitions, 450-1 
determinant, 451-2 
eigenvalues and eigenvectors, 456-7 
generalized inverse, 247, 418, 426-7
Hermitian, 451 
identity, 451 
invariants, 43,457 
inverse, 452 
linear equations, 452-3 
orthogonal, 452 
symmetric, 451 
transpose, 451 

maximum time path, 71-2,164-6 
Maxwell relaxation time, 355-6 
Maxwell viscoelastic material, 355 
mean recurrence time, 278 
Mediterranean collision zone, 337-9 
megaton, energy unit, 11 
Mercury, 211 
meridians, 463 
metastability, 316n3
meteor impacts, as seismic source, 241 
meteorites, composition of, 209 
mesosphere, 319 

Mexico City earthquake (1985), 12,18 
mica, anisotropy, 179-80 
microplates, 307 
microearthquakes, 299, 305 
microseismicity, 25 
microseisms, 400 
Mid-Atlantic ridge, 298-9 
mid-ocean ridges, 286-8,298-9,305-7 
migration, reflection seismology, 152-6 
Millikan, R., 7 
minerals 

anisotropy, 179-80 
crust and upper mantle, 130-4 
phase changes and deep earthquakes, 317-18 
phase changes and intermediate earthquakes, 321 
in subducting slabs, 315-17 
in transition zone, 205-7 
minimum time path, 71-2,164-6 


mode-wave duality, 101 
model resolution matrix, 427 
models 

of the earth, 62-3,119,162 
in inverse problems, 415-16 
plate motions, 8,293-4 
use of, 5-9 

modes, normal see normal modes 
Modified Mercalli scale, 14-16 
Moho (Mohorovičić discontinuity), 75
discovery of, 122 

reflection and transmission at, 82-4,122 
geological composition, 130-1 
waveforms, effects on, 127-8,130 
Mohorovičić, Andrija, 122
Mohr envelopes, 353 
Mohr’s circle, 350-4 
moment, see seismic moment 
moment magnitude, 266,273 
moment tensor, seismic, 239-51 

compensated linear vector dipoles (CLVDs), 245-6

interpretation of, 249-51 
inversion, 246-9 
isotropic, 245 
stress (P, T, B) axes, 243-4 
moon 

moonquakes, 210 
scattering of seismic waves, 189-90 
velocity structure, 210-11 
Mt. St. Helens, explosion, 240-1 
moveout, 134,142 

multichannel data geometry, reflection seismology, 
140-1 

multipathing 

seismic waves, 185,187-8 
tsunami, 100 

multiplet, 104,114-15,184,194, 388 
music of the spheres, 110n7 

namazu (catfish), 322n5 
Nankai trough, 322-3 
Nazca plate, 144, 305, 307,339-40 
networks see seismological networks 
New Madrid earthquakes (1811 & 1812) and 
seismicity, 12,14-16,274,343-6 
Newtonian fluids, 356 

Newton’s second law of motion, 29, 38,47,101 

Niigata earthquake (1964), 20 

Ninetyeast ridge, 329 

Nisqually earthquake (2001), 334

NMO see normal moveout 

nodal lines, 104-6 

nodal planes, 219,222,224,226-9 

nodal surfaces, 105-7 

nodes, 92,105 

noise, in seismograms, 141,145, 369,383,395,400 
noncausality, 195,378,405 
nonlinear tomography, 433 
normal fault earthquakes, 218-19,298-9, 307, 
328-9,334 
surface waves, 236 
normal modes of the earth, 101-15 
attenuation, 114,434-7 
dispersion, 107-9 
inverse problem for, 434-7 
radial, 106 
of a sphere, 101-11 
spheroidal, 106-10 
splitting, 114-15 
synthetic seismograms, 111 
torsional, 104-10 
traveling wave equivalence, 106-10 
normal modes of string, 36-8 
normal moveout (NMO), 134,142 
NORSAR see Norwegian Seismic Array 
North American plate 
absolute motion, 297-8 
boundary with Pacific plate, 291-3, 334 
intraplate earthquakes, 343-7 
relative motion, 294 
rigidity, 344 
stress field, 347 

North Anatolian fault, 339,363 
Northridge earthquake (1994), 13,14, 257-8, 282, 
363 

Northwest Pacific, subduction zones, 319 
Norwegian Seismic Array (NORSAR), 409 
nuclear explosion, source, 241, 245 
nuclear testing, monitoring of, 26-8,198,407,409 
null (B) axis, 220-2,226-8,243-4 
NUVEL-1A global plate motion model, 293-6, 
326,441 

Euler vectors, 294 
Nyquist frequency, 386-8 

oblique convergence, 321-3 
oblique slip, 225 
oblique spreading, 299 
ocean bottom seismometers (OBS), 408,409 
oceanic crust 
anisotropy, 180 
mineralogical transitions, 321 
refraction studies, 129 
structure, 129 

oceanic earthquakes, intraplate, 326-32 
oceanic lithosphere 
age of, 287 

anisotropy, 180-1,182 
evolution of, 299-305 
forces and stresses, 328-31 
oceans, evolution of, 288 
Oldham, Richard, 167n5 
olivine 

anisotropy, 60,179 
in mantle, 205 
metastable wedge, 316-18 
spinel transition phase, 311 
strength of, 357, 358 
Omori, Fusakichi, 277n3 
Omori’s law, 277 

one-dimensional scalar wave equation, 30 
origin time, 1,416 
orthogonal transformations, 456 
outlier earthquakes, 319 

overdetermined system of equations, 99,247,417, 
425 

overtones, 91,105 

P waves, 3,56-61 
at interfaces, 81-6 
body wave phases, 107,164-6 
core phases, 166-9 
critical angle, 67-8 
displacement equations, 63-5 
equations, 53-4 
first motions, 219 
in layered medium, 63-8 
ray parameter, 69-70 


reflection and transmission: at an interface, 81-4; at a free surface, 79-81
refraction, 122-3,127-8 
Snell’s law, 66-7 
SV waves coupling, 57 
transverse isotropy, 179 
velocity, 58-9 
waveform modeling, 231-3 
Pacific Plate motion, 294 
palaeomagnetism, 293 
palaeoseismology, 22-3,28 
Pallett Creek, earthquake forecasting, 22-4, 281 
Palmdale Bulge, 25,28 
paradigm shifts, 9 

parameter space inversion, 436,438-9 
Parkfield, earthquake forecasting, 23, 281 
Parseval’s theorem, 375 
particle motion plot, 58, 89, 182,189 
passive margins, 334 
perfect fluids, 50 
period, 31-2 
period equations, 90 
permutation symbol, 449 
perovskite, 205,208,316 
Peru trench 
seismic section, 144 
tectonics, 339-41 
phase nomenclature, 164 
phase spectrum, 95, 373 
phase velocity, 94, 97-9,107,195 
physical dispersion, seismic waves, 96,194-6 
plane wave decomposition, 147 
plane waves, 54-5 
energy in, 61-2 
in layered medium, 63-5 
Snell’s law, 66-75 
see also P waves; S waves 
planetary evolution, 210-11 
plastic deformation, 349 
plate boundaries, 286-7,333-4 
plate boundary zones 
continental plates, 334-9 
faults, 260 
plate dynamics, 288 
plate motions, 288,290-8 
absolute plate motions, 296-8 
continental plates, 334 
global plate motions, 8,293-5 
inverse problem, 439-41 
relative plate motions, 290-3 
space-based geodesy, 295-6 
plate model, oceanic lithosphere, 302-3 
plate tectonics, 5,286-90 
continental earthquakes, 333-48 
oceanic intraplate earthquakes, 326-32 
plate kinematics, 290-8, 334 
spreading centers, 298-307 
subduction zones, 307-25 
point sources, 72,152-3,231 
Poisson distribution, 278, 280 
Poisson solid, 51 
Poisson’s ratio, 51 
polar coordinates, 443 
polarization, shear waves, 57-8,178 
poloidal modes see spheroidal modes 
pore pressure, 353-4 
postseismic phase, 217 
potentials, 54 
power spectrum, 384 
precision, of estimates, 391-2 


precursors 

earthquake predicting, 24-6 
from FIR filters, 405 
to PKP phase, 168
to ScS phase, 172-3 
to SS phase, 395 

PREM (Preliminary Reference Earth Model), 162, 
171 

density structure, 202 
model parameters, 203 
preseismic stage, 217 
pressure 

earth profile, 202 
effect on rocks, 349 
hydrostatic, 200 
lithostatic, 50 

principal stresses, 42-3,227, 350 
probability, assessment of, 7,278-81,441 
probability density distribution, 278 
propagation of errors, 393-4 
pure path method, variable velocity measurement, 
98-9 

pyrolite, 205 

quality factor Q, wave attenuation, 114,190-1, 
192-3,197-8,229-30,434-5 
quartz 

in rocks, 132 
strength of, 357, 358 

radial component, 57-9 
radial earth model, 162 
radial order, modes, 102 
radiation patterns 
body waves, 220-2 
surface waves, 236-9 
radon gas, earthquake precursor, 25 
Radon transform, 147 
ramp function, 372 
ray parameter 
definition, 69-70 
layered medium, 134-5 
spherical earth, 157 
ray paths 

dipping interface, 123-6 
low-velocity zone, 161 
seismic waves, 65,120-1 
spherical earth, 157-9 
velocity increase, 160 
ray theory, see geometric ray theory 
Rayleigh waves 
definition, 86-7 
dispersion, 96, 98-9 
focal mechanisms, 236-9 
in homogeneous halfspace, 87-9 
inversion, 438 

relation to spheroidal modes, 106-7,110 
real-time data, 408 
real-time warnings, earthquake, 26 
receiver function, 380 
reciprocity, principle of, 38,122 
recurrence time, 278 
Red Sea, 333, 334 
reflection coefficients 
P-SV waves, 79-84 
SH waves, 76-8 
string waves, 33 
reflection seismology 

common midpoint (CMP) stacking, 141-4 
deconvolution, 148-52 



examples, 85-6 

intercept-slowness formulation, 137-40
migration, 152-6 

multichannel data geometry, 140-1 
principle of, 2 
signal enhancement, 145-7 
travel time curves, 134-7 
reflectivity method, 127 
refraction 
critical, 67 
definition, 2,67 
refraction seismology 
crustal structure, 128-31 
dipping layer method, 123-6 
flat layer method, 120-3 
principle of, 2 
regional networks, 410 
Reid, H., 215 
relative plate motion, 290 
relaxation time, 191, 355 
reservoirs, cause of earthquakes, 18 
residual vector, 418 
residuals, travel time, 420 
resolution, model, 415 
resolution matrix, model, 427 
resonance curve, 193-4 
resonant frequency, 193 
resonant period, of buildings, 17 
reverse fault, 45, 218, 225-7, 236, 244

rheology, 349-50 
Richter, Charles, 274n2 
Richter scale, 263 
ridge push, 328,330 
rifts, continental, 333,334 
right-lateral slip, 218 
rigid body rotation, 47 
rigidity, 50 
ringwoodite, 205 
rise time, 230,267 
risks, seismic, 11-14 
rocks 

anisotropy, 179-80 
crust and upper mantle, 130-4 
fracture and friction, 350-4 
friction, 359-64 
strength of, 357-9 
viscosity, 355-7 
rupture 

geometry, 269-70 
direction, 230,231 
process, 258-9 
propagation, 238-9 
time, 230,267 
velocity, 230 

S waves, 3,56-61 

body wave phases, 164-6 
core phases, 166-9 
displacement equations, 63-5 
equations, 53-4 
motion, 221-2 
radiation pattern, 220-2 
ray parameter, 69-70 
transverse isotropy, 178-9 
velocity, 59 
visualizing, 174-6 
SH waves 
definition, 57 

reflection and transmission, 76-8 
Snell’s law, 68-9
waveform modeling, 233 
SV waves 

energy flux, 80-1 
P wave coupling, 57 
reflection and transmission, 79-84 
Snell’s law, 66-7 
Sacks, Selwyn, 149n5 
sampling 

and b-values, 275, 277
and earthquake probability, 278 
cell hit count, 311-12 
of continuous data, 385-7 
San Andreas fault, 5,11,21 
aseismic slip, 262 
b-values, 276

earthquake probability, 279 
and earthquakes, 215,260 
“fault strength” paradox, 363 
heat flow across, 363 
interseismic deformation, 260 
interseismic motion, 260 
locking, 263 
palaeoseismology, 22-3 
plate boundary zone, 334 
plate movements, 293 
seismic gaps, 23-4 
space geodesy, 296 

San Fernando earthquake (1971), 12,18,265,267, 
334,363 

seismograms, 406-8 

San Francisco earthquake (1906), 11-12,215 
damage caused by fire, 18 
seismoscope recording, 401 
source parameters, 265,266,267 
stress shadow, 363 
sand blows, 20 

Sanriku earthquake (1896), 12,19 
SAR see Synthetic Aperture Radar interferometry 
Satellite Laser Ranging (SLR), 251,296 
satellites, use in geodesy, 251-4,296 
scalar wave equation 
one-dimensional, 30 
three-dimensional, 54-5 
scalars 

definition, 444-5 
scalar fields, 458-9 
scalar product, 446-7 
scale invariance, 274-5 
scaling relations, seismic sources, 268-9 
scattering 

PKP precursors, 169-70 
seismic waves, 189-90 
Schmidt projection, 223n3 
sectoral harmonics, 104 

SEED see Standard for the Exchange of Earthquake 
Data 

seismic coupling, 323-4 
seismic cycle, 217,259-63 
seismic data 
networks, 407-12 
publication of, 251, 398 
sampling of, 385-7 
seismic efficiency, 273 
seismic energy, 61-2,273 
seismic gaps, 23-4,280-1,323 
seismic hazards and risks, 11-14 
seismic intensity, 14-16 
seismic lithosphere (lid), 170,304,435,437 
seismic moment, 4,221-2,265,273,305 


seismic moment tensor, 240,241,242-4 
seismic parameter, 200 
seismic phases, P waves: 
antipodal phases, 169-70 
P, 3,163-5,232-4,396,433 
P coda, 190 

PcP, 109,163-4,166,195 
PcPPKP, 170

Pdiff (also Pd), 58, 86, 109, 164, 167-8
PdP, 172-3 
P ,122-3 

P-„ 123 

P2, 129

P3, 129
P3P, 129
P,P, 123 
PKIIKP, 169

PKiKP, 58, 163-4, 167-9, 409

PKIKP, 110,165,167-9,184,198 

PKJKP, 107, 110, 165, 169

PKKP, 58, 86,164,169 

PKP, 109,163-4,167-9,189,433 

PKP2, 167

PKP-AB, 167-9

PKP-BC, 167-9, 184

PKP-DF, 167-9, 184, 210

PKP precursors, 168-9 

PKPPKP, 169 

PmP, 122-3, 127-9

Pn, 122-3, 127-31, 180-1

Pnl, 123

pP, 3,163-6,232-4, 396 

PP, 3,58, 86,163-7,169, 396 

P'P', 164,169 

pPcP, 165 

pPcPSKKP, 165 

pPdiff, 58

pPKP, 164 

pPP, 163,166 

PPP, 164-5,167 

pSP, 58

PJ, 234 

ScP, 163-4,166 
SKiKP, 163-4
SKP, 58,163-5 
SKPPKP, 165 
SKKP, 58,164-5 
sP, 163-6,232-4,395 

sPdiff, 58

SP, 86,164-5 

V,234 

seismic phases, S waves: 

PcS,163,166 
PKS, 58 
PKKS, 164 
PPS, 165 
pPS, 58 
PS, 58,164 
pSKS, 164 

S, 3,108,163-4,176, 383-4,433 
Sbc, 173 
Scd, 173 

ScS, 3, 85,108-10,112,163-6,176,182,193, 
195,384,433 
ScSp, 85 

ScS2, 3, 108, 165-7, 176
ScS3, 3, 176
ScS4, 3
ScS220S, 176
ScS400S, 176

ScS670S, 176

Sdiff, 58, 86, 108-9, 176, 182-3

Sdiff200Sdiff, 176
Sdiff400Sdiff, 176
Sdiff670Sdiff, 176

SdS,172-3 
Sg, 122 

SKS, 58, 86, 109, 163-5, 168, 174, 181-2

SKKS,58,86,163-5,168 

SKKKS, 168

Sn, 122

SmKS, 174

sPS, 165-6

sS, 3, 108,163-4,176 

sScS, 3, 108, 176

sScS2, 3

sScS3, 3

sScS4, 3

SS, 3, 58, 86, 108, 163-7, 176, 383-4, 395, 433
sSKS, 164 

SSS, 3, 86, 108, 164-6, 176
sSS, 108 

S4, 3, 86, 164, 166, 176

S220S, 176
S400S, 176
S410S, 395
S520S, 395
S660S, 395
S670S, 176

S5, 166, 176
seismic ray, 6, 65 
seismic section, 143-4 
seismic slip 

continental deformation zones, 340-1 
subduction zones, 323-4 
seismic sources, 1,4,141 
air gun, 148-9 

double couples, 220,240,242 
equivalent body force, 220,240 
exploding reflectors, 152 
explosive, 245 
force couples, 241-2 
isotropic, 245,247 
magnitudes and moment, 4, 263-6 
moment tensors, 239-51 
moving point, 231 
point, 56,152-3 
scaling relations, 268-9 
single forces, 240-1 
spectra, 266-8 
stress drop, 269-73 
time function, 222,230, 235,258-9 
Vibroseis unit, 149-52 
seismic spectrum, 59 
seismic strain rate tensor, 341 
seismic velocity see velocity 
seismic waves 
attenuation, 114,185-98 
introduction to, 1-3 
in layered medium, 62-86 
phases, 2-3 
plane waves, 54-5, 61 
ray paths, 2, 65,120-1 
seismic wave equation, 53-4 
signal filtering, 369 
in spherical earth, 157-62 
spherical waves, 55-6 
string waves, 29-38 


travel times, 60-1,119-20 
see also body waves; surface waves 
seismicity 

deep, 289-90,312, 319-22 
geographic distribution, 9-10,289 
temporal distribution, 11,274-6 
seismograms 

common midpoint (CMP) stacking, 141-4 
cross-correlation, 383-5 
data processing sequence, 156-7 
data sampling, 385-7 
digital, 405-7 
Fourier analysis, 369-77 
introduction to, 1-3 
linear systems, 377-85 
mode observation, 110 
multichannel, 140-1 
P and S waves, 57-8, 60-1 
receivers, 141 
record sections, 122 
rotated, 57-8 
stacking, 391-7 
synthetic, 111,383 
seismological networks 
arrays, 407,408-9 
global, 407-8 
regional, 407, 410-12 
seismometers, 1 
analog, 401 
arrays, 408-9 
broadband, 404 

damped harmonic oscillator, 190-1, 398-9 
digital, 403-7 
earth noise reduction, 400 
electromagnetic, 401-2 
force-feedback, 403 
IDA gravimeter, 403 
networks, 407-12 
ocean bottom, 408 
response, 229-30, 379,401-2 
strainmeter, 404,406 
Streckeisen, 404
strong-motion, 404 
time recording, 405 
types of, 141, 385,398,400-5 
WWSSN, 402-3 
Wood-Anderson, 263 
seismometry 
definition, 398 
development of, 401 
seismoscopes, 400-1 
self-similarity, 274 
shadow zone, 161,167-8 
Shah function, 385 

shape-preferred orientation (SPO) anisotropy, 177 

shear modulus, 50 

shear stresses, 40,43-5, 350-1 

shear wave splitting, 181-2 

shear waves see S waves 

shift theorems, 374 

shock wave, from explosion, 245 

Sierra Nevada, crustal structure, 129-30 

signal enhancement, reflection seismology, 145-7 

signal processing, 369 

signals, finite length, 380-3 

silent earthquakes, 262 

sinc function, 73, 381, 383

single forces, 240-1 

single-couple source, 241 

singlets, normal mode, 104,114 


660-km discontinuity, 163-4,171,202-3 205- 
315-20 

slab pull, 314, 315, 324, 330 

slabs, subducting, 286, 308-21 

slant stacks, 147,396 

Slichter mode, 106 

slider model, 360 

sliding, stick-slip, 359-64 

sliding friction, 352 

slip 

aseismic, 262, 324, 340 
at faults, 218,254-6,262-3 
seismic, 323-4 
slip partitioning, 322 
slip vector, 218, 228,243,293 
slow earthquakes, 271 
slowness 
horizontal, 69 
intercept-slowness, 137-40 
and ray parameter, 69-70 
vector, 69 
vertical, 69 

SLR see Satellite Laser Ranging 
slump earthquakes, 241,346-7 
Snell’s law 

and Fermat’s principle, 71 
and Huygens’ principle, 72-3
P-SV waves, 66-7 
SH waves, 68-9 
in spherical earth, 157-9 
SNREI earth, 111-15 

SOFAR channel (SOund Fixing and Ranging), 70 
solidus, 204,209 
source location, 419-20 
source time function, 222, 230,235,258-9 
South American plate, deformation, 339-40, 
342-3,365 

Southern California Seismographic Network, 410 

Soviet Union, nuclear testing, 26-7 

spatial aliasing, 407n7 

spatial eigenfunction, 36 

spatial frequency, 31 

spectral resonance peaks, attenuation, 193-4 
sphere, modes of, 101-11 
spherical coordinates, 462-3 
axes, 465 

distance and azimuth, 463-5 
vector operators, 465-6 

spherical earth, ray paths and travel times, 157-9 

spherical harmonics, 103-4 

spherical waves, 55-6 

spheroidal modes, 106,109-10 

spinel, transition from olivine, 205-6, 311, 316-18

Spitak (Armenia) earthquake (1988), 12-13,15 
splitting, mode, 104,114-15 
SPO see shape-preferred orientation anisotropy 
spread, statistical, 393,420 
spreading centers, 286,298-307 
mid-ocean ridges and transforms, 288, 298-9, 305-7

oceanic lithosphere formation, 299-305 
stable sliding, 357 
stacking, 391-7 

Standard for the Exchange of Earthquake Data 
(SEED), 408 

standard linear solid, 196 
standing waves, 36,101,466 
starting model, 417 
static displacements, faults, 254-6 



static friction, 359-60 

static time correction, 145 

station corrections, 424 

steady state friction, 361 

stereographic projection, faults, 223-8 

stereonet, 223 

Stevenson, D., 174n9 

stick-slip earthquakes, 359-64 

stishovite, 205,208 

strain 

infinitesimal strain theory, 49 
recording of, 404 
seismic strain rate tensor, 341 
strain energy, 52,61-2 
strain tensor, 38,47-8 
see also stress 
strainmeters, 404,406 
Streckeisen seismometers, 404
strength, of lithosphere, 357-9 
strength envelope, 357 
stress 

constitutive equations, 38,48-51 
deviatoric stresses, 45-6 
earthquake stress drop, 269-73 
elastic moduli, 49-50 
field, 45-7,345 

maximum stress difference, 356 
normal stress, 40,42,350-1,353, 357 
in oceanic lithosphere, 328-31 
principal stresses, 42-3 
and rock fracture, 350-4 
shear stresses, 40,43-5,350-1 
stress drop, 269-73 
stress and strain tensor, 50,177-8 
stress tensor, 38,39-42,350 
stress-strain curve, 349 
viscous relaxation of, 355-6 
yield stress, 349 
see also strain 
strike angle, 218 

strike-slip fault, 45,218,225-7,236,244,254-5, 
269,298-9, 328 
string waves 
calculation, 466-9 
harmonic waves, 31-2, 35-6 
normal modes, 36-8 
reflection and transmission, 32-5 
theory, 29-31 

strong-motion sensors, 404, 406 
STS-1 seismometer, 404 
STS-2 seismometer, 404 
subduction zones, 198,286, 307-25 
earthquakes, 307-8,309-10 
interplate trench earthquakes, 321-5 
subduction slab earthquakes, 312-21 
thermal models, 308-12 
summation convention, 449 
superadiabatic gradient, 201,209 
superposition, 34,219, 377 
surface force, 39 
surface wave magnitude, 264 
surface waves, 3, 86-93 
anisotropy, 182 
dispersion, 87, 93-100,433 
focal mechanisms, 235-9 
geometry, 87 
Lg waves, 197-8 
mantle waves, 251 
mode equivalence, 106-7,109 
radiation amplitudes, 236-7 
waveform modeling, 235-9
see also Rayleigh wave, Love wave, tsunami 
swarm, earthquake, 326, 328 
sweep signals, 149-52 
symmetric matrix, 458 

Synthetic Aperture Radar interferometry (InSAR), 
252-4 

synthetic seismogram calculation, 466-9 
synthetic seismograms, 111, 229 
synthetic waveforms see waveform modeling 
systematic errors, 391-2, 397 
systematic bias, 392-3 

t*, 196 

take-off angle, 222 

tangential motion, spheroidal modes, 110 
tangential traction, 43-4,116 
Tango, Japan, earthquake (1927), 254,256 
Tangshan, China, earthquake (1976), 12, 26

tau function, 121,138,159 
tau-p method, 137-40
tectosphere, 170 
temperature 
in the earth, 202-4 
measuring variations, 186 
tensional stress, 40,51 
tensional (T) axis, 221,227-8 
tensor 

invariants, 43,457 
stress, 38-9 
strain, 38,47-8 
Terceira Rift, 326
tesseral harmonics, 104
thermal boundary layer, 170,204,287 
thermal diffusivity, 300,309 
thermal isostasy, 301n3 
thermal lithosphere, 304 
thermal models, subduction, 308-12 
thrust earthquakes, 321-3, 340 
moment tensor, 250 

thrust fault, 45, 218, 225-7, 236, 244, 328, 336

Tibet, 336-7 

tides, solid earth, 373,400 
Tien Shan mountain belt, 336 
time series analysis, 369 
time-dependent behavior, 349-50, 355 
time-frequency domain equivalence, 229,235, 
373-4 

Tokyo earthquake (1923), 12,18-19,407 
tomography, 99,425 
attenuation, 198-9 
cross-borehole, 433 
nonlinear, 433 
whole-mantle, 433-4 
see also travel time tomography 
Tonga arc, 311 

Tonga subduction zone, 199,320 
toroidal modes, 104 
torsional modes, 104-6 
total reflection, 33 
total internal reflection, 67 
traction vector, 39 
normal traction, 44 
tangential traction, 44 
transfer function, 377,379 
transform faults, 286 
continental, 334 
spreading centers, 298,305-7 


transition zone, 162-3,171 
velocities and density, 203
transmission coefficients
P-SV waves, 79-84
SH waves, 76-8 
string waves, 33 
transverse component, 57 
transverse isotropy, 178-9,208 
transverse waves, 57 
travel time curves 
AK135,162 

earthquake location, 422-4 
IASP91,162-4 
inversion, 161-2 
Jeffreys-Bullen, 162 
PREM, 202-3 
travel time equation 
dipping layer, 125 
direct arrival, 120 
head wave, 121 
layered structure, 123 
reflected wave, 121 
in spherical earth, 159 
travel time tables, 162-4 
travel time tomography 
examples, 430-4 
inverse problem, 426-30 
subduction zones, 311 
theory, 424-6 
travel times 

intercept-slowness formulation, 137-40 
low-velocity zone, 160-1 
reflected waves, 134-7 
refracted waves, 121-2 
residuals, 420 

seismic waves, 60-1,119-20 
spherical earth, 157-9 
triplication, 160,171 
upper mantle, 171 
trenches, 288 
see also subduction zones 
triple vector dipole, 245 
triplication, 160 
core, 168 
upper mantle, 171 
Truckee earthquake (1966), 267 
tsunamis, 19-20,241,271 
dispersion, 99-101 
real-time warnings, 26 
seismic sources, 241,271 
two-station method, phase velocity, 97-8 

ultra-low-velocity zone (ULVZ), 174,208 
uncertainty principle, 7,382 
uniaxial tension, 51 
uniformitarianism, 341n4 
United States 

Advanced National Seismic System (ANSS), 
410 

earthquake forecasting, 21 
earthquake hazard map, 15,21 
earthquake risk, 13-14 
National Earthquake Hazards Reduction 
Program, 25 

National Seismographic Network (NSN), 
408 

regional networks, 410 
upper mantle see mantle 
variance, 393-4, 420
variance-covariance matrix, 421 
vector calculus, 458-62 
curl, 460-1 
divergence, 459-60 
gradients, 459 
Laplacian, 461-2 
scalar and vector fields, 458-9 
vector dipole, 241 

vector potentials, layered medium, 63 
vector spherical harmonics, 104,106 
vector transformations, 454-8 
coordinate transformations, 455-6 
eigenvalues and eigenvectors, 456-8 
symmetric matrix, 458 
vectors 
scalar product, 446
spherical coordinates, 465-6 
vector fields, 458-9 
vector operations, 446 
vector products, 447-8 
vector spaces, 449-50 
velocity
apparent velocity, 65-6

dispersion of surface waves, 93-4, 97-9 

filtering, 145-7 

group, 94 

interval velocity, 136 
P and S waves, 58-9
in spherical earth, 159-61
subduction zones, 311 
velocity structure 
at Moho, 127-8, 130
of the earth, 119-20,162 
Venus, 211,287-8 
vertical slowness, 69,137 

Very Long Baseline Interferometry (VLBI), 251,296 

Vibroseis unit, sweep signals, 149-52,214 

viscoelastic material, 196, 355 

viscosity, of mantle, 204, 331-2, 350, 355-7 

viscous fluid models, 365 

volcanoes 

predicting eruptions, 20-1 
as seismic sources, 240-1,246 

Wadati, K., 288 

Wadati-Benioff zones, 288,307-8,312, 319 
wadsleyite, 205, 315 

Walvis ridge, seismic wave dispersion, 96-7 
water layer, 212,234 
wave equation 

one-dimensional, 30 
homogeneous plane wave, 54 
inhomogeneous plane wave, 55 
migration, 155 
spherical wave, 36 
wave field, 32 

downward or upward continuation, 155-6 
wave front 

body wave, 187 
plane, 55 
energy, 56,61-2 


spherical, 56 
surface wave, 187 
wave vectors, 55, 65 

horizontal component, 69 
vertical component, 69 
waveform annealing, 74 
waveform modeling, 171, 229-39 
body waves, 231-5 
source time function, 230-1 
surface waves, 235-9 
waveguides, 70, 321 
wavelength, 31-2 
wavenumber, 31-2 
Wegener, Alfred, 9,286, 295 
weighted damped least squares inversion, 430

weighted least squares solution, 430 
Whittier Narrows earthquake (1987), 363 
Wilson cycle, 333 
Wilson, J. Tuzo, 333n1
window functions, 380-1 
Wood-Anderson seismograph, 263 
World Wide Standardized Seismographic Network 
(WWSSN), 26,288,398,407,408 
seismometers, 402-3 

Yellowstone hot spot, 334,347-8 
yield, explosion, 26-7 
yield stress, 349 
Young’s modulus, 51 

Zeeman effect, 115 
zero-offset section, 143,152,154 
zonal harmonics, 104 



This book is an introduction to seismology and its role in the earth sciences, and is written
for advanced undergraduate and beginning graduate students.

The fundamentals of seismic wave propagation are developed using a physical approach 
and then applied to show how refraction, reflection, and teleseismic techniques are 
used to study the structure and thus the composition and evolution of the earth. 

The book shows how seismic waves are used to study earthquakes and are 
integrated with other data to investigate the plate tectonic processes that 
cause earthquakes. Figures, examples, problems, and computer exercises 
teach students about seismology in a creative and intuitive manner. 

Necessary mathematical tools including vector and tensor analysis, 
matrix algebra, Fourier analysis, statistics of errors, signal processing, 
and data inversion are introduced with many relevant examples. 

The text also addresses the fundamentals of seismometry and 
applications of seismology to societal issues. Special attention 
is paid to help students visualize connections between different 
topics and view seismology as an integrated science. 

An Introduction to Seismology, Earthquakes, and Earth
Structure gives an excellent overview for students of 
geophysics and tectonics, and provides a strong foundation 
for further studies in seismology. 